NOVEL NUCLEIC ACIDS AND POLYPEPTIDES



1. TECHNICAL FIELD

The present invention provides novel polynucleotides and proteins encoded by such
polynucleotides, along with uses for these polynucleotides and proteins, for example in
therapeutic, diagnostic and research methods.



2. BACKGROUND 

Technology aimed at the discovery of protein factors (including e.g., cytokines, such as
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past
decade. The now routine hybridization cloning and expression cloning techniques clone novel
polynucleotides "directly" in the sense that they rely on information directly related to the
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of
hybridization cloning; activity of the protein in the case of expression cloning). More recent
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences
based on the presence of a now well-recognized secretory leader sequence motif, as well as
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the
state of the art by making available large numbers of DNA/amino acid sequences for proteins
that are known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping; identification of mutations responsible for
genetic disorders or other traits, to assess biodiversity, and to produce many other types of data
and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules,
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or
more epitopes present on such polypeptides, as well as hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including expression
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such
polynucleotides and cells genetically engineered to express such polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The
polypeptide sequences are designated SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960.
The nucleic acids and polypeptides are provided in the Sequence Listing. In the nucleic acids
provided in the Sequence Listing, A is adenosine; C is cytosine; G is guanine; T is thymine; and N
is any of the four bases. In the amino acids provided in the Sequence Listing, * corresponds to the
stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 under
stringent hybridization conditions; nucleic acid sequences which are allelic variants or species
homologues of any of the nucleic acid sequences recited above; or nucleic acid sequences that
encode a peptide comprising a specific domain or truncation of the peptides encoded by SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. Also included is a polynucleotide comprising a
nucleotide sequence having at least 90% identity to an identifying sequence of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954 or a degenerate variant or fragment thereof. The identifying
sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The
sequence information can be a segment of any one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 that uniquely identifies or represents the sequence information of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952,
3937-3942 or 3949-3954 or novel segments or parts of the nucleic acids of the invention are used as
primers in expression assays that are well known in the art. In a particularly preferred embodiment,
the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or novel
segments or parts of the nucleic acids provided herein are used in diagnostics for identifying
expressed genes or, as well known in the art and exemplified by Vollrath et al., Science 258:52-59
(1992), as expressed sequence tags for physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954; a polynucleotide comprising any of the full length protein
coding sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954; and a polynucleotide
comprising any of the nucleotide sequences of the mature protein coding sequences of SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The polynucleotides of the present invention also
include, but are not limited to, a polynucleotide that hybridizes under stringent hybridization
conditions to (a) the complement of any one of the nucleotide sequences set forth in SEQ ID NO:
1-984, 1969-2952, 3937-3942 or 3949-3954; (b) a nucleotide sequence encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic
variant of any polynucleotides recited above; (d) a polynucleotide which encodes a species homolog
(e.g. orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a
polypeptide comprising a specific domain or truncation of any of the polypeptides comprising an
amino acid sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in SEQ ID NO: 985-1968, 2953-3936, 3943-
3948 or 3955-3960, or the corresponding full length or mature protein. Polypeptides of the
invention also include polypeptides with biological activity that are encoded by (a) any of the
polynucleotides having a nucleotide sequence set forth in SEQ ID NO: 1-984, 1969-2952, 3937-
3942 or 3949-3954; or (b) polynucleotides that hybridize to the complement of the polynucleotides
of (a) under stringent hybridization conditions. Biologically or immunologically active variants of
any of the polypeptide sequences in the Sequence Listing, and "substantial equivalents" thereof
(e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% amino acid sequence
identity) that preferably retain biological activity are also contemplated. The polypeptides of the
invention may be wholly or partially chemically synthesized but are preferably produced by
recombinant means using the genetically engineered cells (e.g. host cells) of the invention.

The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of 
the invention. 

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture medium
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide
from the culture or from the host cells. Preferred embodiments include those in which the
protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a variety of
techniques known to those skilled in the art of molecular biology. These techniques include use
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample
using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as
expressed sequence tags for identifying expressed genes or, as well known in the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical
mapping of the human genome.

The polypeptides according to the invention can be used in a variety of conventional
procedures and methods that are currently applied to other proteins. For example, a polypeptide
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight
markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical condition
which comprises the step of administering to a mammalian subject a therapeutically effective
amount of a composition comprising a polypeptide of the present invention and a
pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized, for
example, in methods for the prevention and/or treatment of disorders involving aberrant protein
expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting
the sample with a compound that binds to and forms a complex with the polynucleotide of
interest for a period sufficient to form the complex and under conditions sufficient to form a
complex and detecting the complex such that if a complex is detected, the polynucleotide of
interest is detected. The invention also provides a method for detecting the polypeptides of the
invention in a sample comprising contacting the sample with a compound that binds to and forms
a complex with the polypeptide under conditions and for a period sufficient to form the complex
and detecting the formation of the complex such that if a complex is formed, the polypeptide is
detected.

The invention also provides kits comprising polynucleotide probes and/or monoclonal
antibodies, and optionally quantitative standards, for carrying out methods of the invention.
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as
recited above.

The invention also provides methods for the identification of compounds that modulate
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides
of the invention. Such methods can be utilized, for example, for the identification of compounds
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a
compound that binds to the polypeptides of the invention comprising contacting the compound
with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound
complex, wherein the complex drives expression of a reporter gene sequence in the cell; and
detecting the complex by detecting the reporter gene sequence expression such that if expression
of the reporter gene is detected the compound that binds to a polypeptide of the invention is
identified.



The methods of the invention also provide methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also
useful for the same functions known to one of skill in the art as the polypeptides and
polynucleotides to which they have homology (set forth in Tables 2 and 9); for which they have
a signature region (as set forth in Tables 3 and 10); or for which they have homology to a gene
family (as set forth in Tables 4 and 11). If no homology is set forth for a sequence, then the
polypeptides and polynucleotides of the present invention are useful for a variety of applications,
as described herein, including use in arrays for detection.

4. DETAILED DESCRIPTION OF THE INVENTION

4.1 DEFINITIONS

It must be noted that as used herein and in the appended claims, the singular forms "a", 
"an" and "the" include plural references unless the context clearly dictates otherwise. 

The term "active" refers to those forms of the polypeptide which retain the biologic
and/or immunologic activities of any naturally occurring polypeptide. According to the
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the capability of the
natural, recombinant or synthetic polypeptide to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application are those cells which are engaged in 
extracellular or intracellular membrane trafficking, including the export of secretory or 
enzymatic molecules as part of a normal or disease process. 

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that
total complementarity exists between the single stranded molecules. The degree of
complementarity between the nucleic acid strands has significant effects on the efficiency and
strength of the hybridization between the nucleic acid strands.
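
The base-pairing rule in this definition can be illustrated with a short sketch. It is not part of the original disclosure: the names PAIRS, complement_3_to_5 and fraction_paired are hypothetical, only the four DNA bases are handled, and "partial" complementarity is modeled simply as the fraction of aligned positions that pair.

```python
# Watson-Crick pairing rules for the four DNA bases.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_3_to_5(strand_5_to_3: str) -> str:
    """Return the complementary strand, written 3'->5', base by base."""
    return "".join(PAIRS[base] for base in strand_5_to_3.upper())


def fraction_paired(strand_5_to_3: str, other_3_to_5: str) -> float:
    """Fraction of aligned positions that form Watson-Crick pairs.

    1.0 corresponds to "complete" complementarity; a value between
    0 and 1 corresponds to "partial" complementarity.
    """
    if not strand_5_to_3 or len(strand_5_to_3) != len(other_3_to_5):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(
        PAIRS[a] == b
        for a, b in zip(strand_5_to_3.upper(), other_3_to_5.upper())
    )
    return paired / len(strand_5_to_3)
```

With these definitions, complement_3_to_5("AGT") gives "TCA", matching the 5'-AGT-3' / 3'-TCA-5' example in the text.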

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages particularly
from the yolk sac, mesenteries, or gonadal ridges during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that
comprise the adult specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides which 
modulates the expression of an operably linked ORF or another EMF. 

As used herein, a sequence is said to "modulate the expression of an operably linked
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs
include, but are not limited to, promoters, and promoter modulating sequences (inducible
elements). One class of EMFs are nucleic acid fragments which induce the expression of an
operably linked ORF in response to a specific regulatory factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic
acid which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.
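
The two notational conventions stated above (T read as U for RNA, and N standing for any of A, C, G or T) can be sketched as follows. This is an illustration only; the helper names to_rna and expand_n are hypothetical and do not come from the disclosure.

```python
from itertools import product


def to_rna(dna: str) -> str:
    """Substitute U (uracil) for T (thymine); N is left unchanged."""
    return dna.upper().replace("T", "U")


def expand_n(dna: str) -> list[str]:
    """List every concrete sequence obtained by replacing each N with A, C, G or T."""
    # Each position contributes either its single fixed base or all four bases.
    choices = ["ACGT" if base == "N" else base for base in dna.upper()]
    return ["".join(combo) for combo in product(*choices)]
```

Note that the number of expansions grows as 4 to the power of the number of N positions, so expand_n is only practical for sequences with a few ambiguous bases.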

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-20.

Probes may, for example, be used to determine whether specific mRNA molecules are
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the
art. Probes of the present invention, their preparation and/or labeling are elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their
entirety.

The nucleic acid sequences of the present invention also include the sequence
information from nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954. The sequence information can be a segment of any one of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954 that uniquely identifies or represents the sequence
information of that sequence of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. One
such segment can be a twenty-mer nucleic acid sequence because the probability that a
twenty-mer is fully matched in the human genome is 1 in 300. In the human genome, there are three
billion base pairs in one set of chromosomes. Because 4^20 possible twenty-mers exist, there are
300 times more twenty-mers than there are base pairs in a set of human chromosomes. Using the
same analysis, the probability for a seventeen-mer to be fully matched in the human genome is
approximately 1 in 5. When these segments are used in arrays for expression studies,
fifteen-mer segments can be used. The probability that the fifteen-mer is fully matched in the expressed
sequences is also approximately one in five because expressed sequences comprise less than
approximately 5% of the entire genome sequence.

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1 ÷ 4^25) times the
increased probability for mismatch at each nucleotide position (3 x 25). The probability that an
eighteen mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.
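
The arithmetic behind these estimates can be sketched as below. This is an illustrative calculation rather than part of the original disclosure: it assumes a uniform random-sequence model, takes the genome size (three billion base pairs) and the 5% expressed fraction from the text, and uses hypothetical function names. Under these assumptions the exact values come out at roughly 1 in 370 for a twenty-mer and roughly 1 in 6 for a seventeen-mer, the same order as the rounded figures quoted above.

```python
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05   # upper bound on the expressed fraction (from the text)


def expected_full_matches(k: int, target_bp: float) -> float:
    """Expected number of chance exact matches of one k-mer among target_bp positions.

    There are 4**k possible k-mers, so a given k-mer matches a random
    position with probability 1 / 4**k.
    """
    return target_bp / 4 ** k


def expected_single_mismatch_matches(k: int, target_bp: float) -> float:
    """Expected chance matches that differ from the k-mer at exactly one position.

    Each of the k positions can hold any of 3 other bases, giving the
    factor 3 * k applied to the full-match probability.
    """
    return 3 * k * target_bp / 4 ** k


if __name__ == "__main__":
    expressed_bp = EXPRESSED_FRACTION * GENOME_BP
    print("20-mer, genome:    1 in", round(1 / expected_full_matches(20, GENOME_BP)))
    print("17-mer, genome:    1 in", round(1 / expected_full_matches(17, GENOME_BP), 1))
    print("15-mer, expressed: 1 in", round(1 / expected_full_matches(15, expressed_bp), 1))
    print("25-mer, one mismatch, genome:",
          expected_single_mismatch_matches(25, GENOME_BP))
```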

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein.
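
A minimal check of this definition might look as follows. It is a sketch under stated assumptions: the standard genetic code's three termination codons are used, the sequence is read in a single fixed frame starting at its first base, and the function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # termination codons of the standard genetic code


def is_open_reading_frame(seq: str) -> bool:
    """True if seq is a whole number of triplets containing no termination codon."""
    seq = seq.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False
    # Read the sequence in frame, three bases at a time.
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return all(codon not in STOP_CODONS for codon in codons)
```

A stop triplet that straddles two codons (as in "ATTAAG", read as ATT then AAG) does not interrupt the frame, which is why the check proceeds codon by codon rather than by substring search.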

The terms "operably linked" or "operably associated" refer to functionally related nucleic
acid sequences. For example, a promoter is operably associated or operably linked with a coding
sequence if the promoter controls the transcription of the coding sequence. While operably
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic
elements e.g. repressor genes are not contiguously linked to the coding sequence but still control
transcription/translation of the coding sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a number of
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its
differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide,
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 500 amino acids, more preferably less
than 200 amino acids, more preferably less than 150 amino acids and most preferably less than
100 amino acids. Preferably the peptide is from about 5 to about 200 amino acids. To be active,
any polypeptide must have sufficient length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that
have not been genetically engineered and specifically contemplates various polypeptides arising
from post-translational modifications of the polypeptide including, but not limited to, acetylation,
carboxylation, glycosylation, phosphorylation, lipidation and acylation.

The term "translated protein coding portion" means a sequence which encodes the full
length protein which may include any leader sequence or any processing sequence.

The term "mature protein coding sequence" means a sequence which encodes a peptide
or protein without a signal or leader sequence. The "mature protein portion" means that portion
of the protein which does not include a signal or leader sequence. The peptide may have been
produced by processing in the cell which removes any leader/signal sequence. The mature
protein portion may or may not include the initial methionine residue. The methionine residue
may be removed from the protein during processing in the cell. The peptide may be produced
synthetically or the protein may have been produced using a polynucleotide only encoding for
the mature protein coding sequence.

The term "derivative" refers to polypeptides chemically modified by such techniques as
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur
in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest, may be found by comparing
the sequence of the particular polypeptide with that of homologous peptides and minimizing the
number of amino acid sequence changes made in regions of high homology (conserved regions)
or by replacing amino acids with consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides may be
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon
substitutions, such as the silent changes which produce various restriction sites, may be
introduced to optimize cloning into a plasmid or viral vector or expression in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain
affinities, or degradation/turnover rate.
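
As an illustration of a silent change that creates a restriction site, the sketch below checks whether a nucleotide substitution leaves the encoded amino acids unchanged. It is an assumption-laden example, not the disclosure's method: it uses the standard genetic code, the names CODON_TABLE, translate and is_silent_change are hypothetical, and the EcoRI recognition sequence GAATTC is chosen only as a familiar example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate complete codons of cds in frame; '*' marks a stop codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def is_silent_change(original: str, variant: str) -> bool:
    """True if variant has the same length and encodes the same amino acids."""
    return len(original) == len(variant) and translate(original) == translate(variant)


if __name__ == "__main__":
    original = "GAGTTC"   # Glu-Phe
    variant = "GAATTC"    # still Glu-Phe, now containing an EcoRI site
    print(is_silent_change(original, variant), "GAATTC" in variant)
```

Here the single G-to-A change in the third position of the glutamate codon leaves the peptide unchanged while introducing the six-base site.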

Preferably, amino acid "substitutions" are the result of replacing one amino acid with
another amino acid having similar structural and/or chemical properties, i.e., conservative amino
acid replacements. "Conservative" amino acid substitutions may be made on the basis of
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10
amino acids. The variation allowed may be experimentally determined by systematically making
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using
recombinant DNA techniques and assaying the resulting recombinant variants for activity.
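
The four residue groupings listed above can be encoded directly, which gives one simple (and deliberately coarse) way to label a substitution as conservative. The sketch assumes that "conservative" means "both residues fall in the same listed group"; the names RESIDUE_CLASSES, residue_class and is_conservative are hypothetical, and one-letter amino acid codes are used.

```python
# Groupings taken from the paragraph above, in one-letter codes.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def residue_class(residue: str) -> str:
    """Return the name of the group containing the given one-letter residue."""
    for name, members in RESIDUE_CLASSES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown amino acid: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues belong to the same group listed in the text."""
    return residue_class(original) == residue_class(replacement)
```

A class-membership test of this kind is only a first filter; as the text notes, the variation actually tolerated is determined experimentally by assaying the resulting variants for activity.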

Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited
for expression, scale up and the like in the host cells chosen for expression. For example,
cysteine residues can be deleted or substituted with another amino acid residue in order to
eliminate disulfide bridges.

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more
preferably at least 99% by weight, of the indicated biological macromolecules present (but water,
buffers, and other small molecules, especially molecules having a molecular weight of less than
1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or
polypeptides present in their natural source.

The term "recombinant," when used herein to refer to a polypeptide or protein, means
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian)
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial"
defines a polypeptide or protein essentially free of native endogenous substances and
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or
proteins expressed in yeast will have a glycosylation pattern in general different from those
expressed in mammalian cells.

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural
or coding sequence which is transcribed into mRNA and translated into protein, and (3)
appropriate transcription initiation and termination sequences. Structural units intended for use
in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant
protein is expressed without a leader or transport sequence, it may include an amino terminal
methionine residue. This residue may or may not be subsequently cleaved from the expressed
recombinant protein to provide a final product.

The term "recombinant expression system" means host cells which have stably integrated
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will
express heterologous polypeptides or proteins upon induction of the regulatory elements linked
to the DNA segment or synthetic gene to be expressed. This term also means host cells which
have stably integrated a recombinant genetic element or elements having a regulatory role in
gene expression, for example, promoters or enhancers. Recombinant expression systems as
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells
can be prokaryotic or eukaryotic.

The term "secreted" includes a protein that is transported across or through a membrane,
including transport as a result of signal sequences in its amino acid sequence when it is expressed
in a suitable host cell. "Secreted" proteins include without limitation proteins secreted wholly
(e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed.
"Secreted" proteins also include without limitation proteins that are transported across the
membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to include
proteins containing non-typical signal sequences (e.g. Interleukin-1 Beta, see Krasney, P.A. and
Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells (e.g.
Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence
may be naturally present on the polypeptides of the present invention or provided from
heterologous protein sources by recombinant DNA techniques.

The term "stringent" is used to refer to conditions that are commonly understood in the
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e.,
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are
described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent 
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for 
14-base oligonucleotides), 48°C (for 17-base oligos), 55°C (for 20-base oligonucleotides), and
60°C (for 23-base oligonucleotides).
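
The oligonucleotide wash temperatures listed above can be held in a small lookup table. The following Python sketch is illustrative only: the names WASH_TEMP_C and wash_temperature_c are hypothetical, and the function deliberately does not interpolate for lengths the text does not mention.

```python
# Exemplary stringent wash temperatures (6X SSC/0.05% sodium pyrophosphate),
# keyed by oligonucleotide length in bases, as listed in the text above.
WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature_c(oligo_length):
    """Return the exemplary wash temperature in degrees C for a listed length.

    Only the four lengths named in the text are supported; any other length
    raises ValueError instead of guessing an intermediate value.
    """
    try:
        return WASH_TEMP_C[oligo_length]
    except KeyError:
        raise ValueError(
            f"no exemplary wash temperature listed for {oligo_length}-base oligonucleotides"
        ) from None
```

For instance, wash_temperature_c(20) looks up the entry for 20-base oligonucleotides.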

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid
sequences, for example a mutant sequence, that varies from a reference sequence by one or more
substitutions, deletions, or additions, the net effect of which does not result in an adverse
functional dissimilarity between the reference and subject sequences. Typically, such a
substantially equivalent sequence varies from one of those listed herein by no more than about
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a
substantially equivalent sequence, as compared to the corresponding reference sequence, divided
by the total number of residues in the substantially equivalent sequence is about 0.35 or less).
Such a sequence is said to have 65% sequence identity to the listed sequence. In one
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a
listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment,
by no more than 25% (75% sequence identity); and in a further variation of this embodiment, by
no more than 20% (80% sequence identity) and in a further variation of this embodiment, by no
more than 10% (90% sequence identity) and in a further variation of this embodiment, by no
more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid
sequences according to the invention preferably have at least 80% sequence identity with a listed
amino acid sequence, more preferably at least 85% sequence identity, more preferably at least
90% sequence identity, more preferably at least 95% sequence identity, more preferably at least
98% sequence identity and most preferably at least 98% identity. Substantially equivalent
nucleotide sequences of the invention can have lower percent sequence identities, taking into
account, for example, the redundancy or degeneracy of the genetic code. Preferably, the nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity, more
preferably at least about 80% identity, more preferably at least about 85% identity, more
preferably at least about 90% identity, more preferably at least about 95% identity, more
preferably at least 98% and most preferably at least about 99% identity. For the purposes of the
present invention, sequences having substantially equivalent biological activity and substantially
equivalent expression characteristics are considered substantially equivalent. For the purposes of
determining equivalence, truncation of the mature sequence (e.g., via a mutation which creates a
spurious stop codon) should be disregarded. Sequence identity may be determined, e.g., using
the Jotun Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity between
sequences can also be determined by other methods known in the art, e.g. by varying
hybridization conditions.
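
The arithmetic in the definition above (residue changes divided by the length of the substantially equivalent sequence) can be sketched in Python as follows. This is a minimal illustration, not the Jotun Hein method: it assumes the counts of substitutions, additions, and deletions have already been obtained from some alignment, and the function and parameter names are hypothetical.

```python
def percent_identity(substitutions, additions, deletions, subject_length):
    """Percent sequence identity under the definition given in the text.

    The variation is (substitutions + additions + deletions) divided by the
    total number of residues in the substantially equivalent (subject)
    sequence; identity is the remainder, expressed as a percentage.
    """
    if subject_length <= 0:
        raise ValueError("subject_length must be positive")
    changes = substitutions + additions + deletions
    if min(substitutions, additions, deletions) < 0 or changes > subject_length:
        raise ValueError("change counts must be non-negative and not exceed subject_length")
    return 100.0 * (subject_length - changes) / subject_length


def is_substantially_equivalent(substitutions, additions, deletions,
                                subject_length, min_identity=65.0):
    """True if identity meets the threshold (65% is the broadest case in the text)."""
    identity = percent_identity(substitutions, additions, deletions, subject_length)
    return identity >= min_identity
```

A 100-residue sequence with 30 substitutions, 3 additions, and 2 deletions has a variation of 0.35 and therefore sits at the 65% identity boundary described above.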

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell
types of an adult organism.

The term "transformation" means introducing DNA into a suitable host cell so that the 
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The 
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether 
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction 
of nucleic acids into a suitable host cell by use of a virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides 
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified 
using known UMFs as a target sequence or target motif with the computer-based systems 
described below. The presence and activity of a UMF can be confirmed by attaching the 
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated 
with an appropriate host under appropriate conditions and the uptake of the marker sequence is 
determined. As described above, a UMF will increase the frequency of uptake of a linked
marker sequence.

Each of the above terms is meant to encompass all that is described for each, unless the
context dictates otherwise. 

4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence Listing. 

The isolated polynucleotides of the invention include a polynucleotide comprising the
nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954; a 
polynucleotide encoding any one of the peptide sequences of SEQ ID NO: 985-1968, 2953-3936, 
3943-3948 or 3955-3960; and a polynucleotide comprising the nucleotide sequence encoding the 
mature protein coding sequence of the polypeptides of any one of SEQ ID NO: 985-1968, 2953- 
3936, 3943-3948 or 3955-3960. The polynucleotides of the present invention also include, but 
are not limited to, a polynucleotide that hybridizes under stringent conditions to (a) the 
complement of any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942
or 3949-3954; (b) nucleotide sequences encoding any one of the amino acid sequences set forth
in the Sequence Listing as SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960; (c) a
polynucleotide which is an allelic variant of any polynucleotide recited above; (d) a
polynucleotide which encodes a species homolog of any of the proteins recited above; or (e) a
polynucleotide that encodes a polypeptide comprising a specific domain or truncation of the
polypeptides of SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960. Domains of
interest may depend on the nature of the encoded polypeptide; e.g., domains in receptor-like
polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic domains, or
combinations thereof; domains in immunoglobulin-like proteins include the variable
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and 
substrate binding domains; and domains in ligand polypeptides include receptor-binding 
domains. 

The polynucleotides of the invention include naturally occurring or wholly or partially 
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides 
may include all of the coding region of the cDNA or may represent a portion of the coding 
region of the cDNA. 

The present invention also provides genes corresponding to the cDNA sequences disclosed
herein. The corresponding genes can be isolated in accordance with known methods using the
sequence information disclosed herein. Such methods include the preparation of probes or primers
from the disclosed sequence information for identification and/or amplification of genes in
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-
3954 can be obtained by screening appropriate cDNA or genomic DNA libraries under suitable
hybridization conditions using any of the polynucleotides of SEQ ID NO: 1-984, 1969-2952, 3937-
3942 or 3949-3954 or a portion thereof as a probe. Alternatively, the polynucleotides of SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 may be used as the basis for suitable primer(s)
that allow identification and/or amplification of genes in appropriate genomic DNA or cDNA
libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and sequences 
(including cDNA and genomic sequences) obtained from one or more public databases, such as
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information,
representative fragment or segment information, or novel segment information for the full-length
gene.

The polynucleotides of the invention also provide polynucleotides including nucleotide
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about
75%, at least about 80%, 81%, 82%, 83%, 84%, more typically at least about 85%, 86%, 87%,
88%, 89%, and more typically at least about 90%, 91%, 92%, 93%, 94%, and even more
typically at least about 95%, 96%, 97%, 98%, 99%, sequence identity to a polynucleotide recited
above.

Included within the scope of the nucleic acid sequences of the invention are nucleic acid
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences
of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, or complements thereof, which
fragment is greater than about 5 nucleotides, preferably 7 nucleotides, more preferably greater
than 9 nucleotides and most preferably greater than 17 nucleotides. Fragments of, e.g. 15, 17, or
20 nucleotides or more that are selective for (i.e. specifically hybridize to) any one of the
polynucleotides of the invention are contemplated. Probes capable of specifically hybridizing to
a polynucleotide can differentiate polynucleotide sequences of the invention from other
polynucleotide sequences in the same family of genes or can differentiate human genes from
genes of other species, and are preferably based on unique nucleotide sequences.

The sequences falling within the scope of the present invention are not limited to these
specific sequences, but also include allelic and species variations thereof. Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954, a representative fragment thereof, or a nucleotide sequence at
least 90% identical, preferably 95% identical, to SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 with a sequence from another isolate of the same species. Furthermore, to accommodate
codon variability, the invention includes nucleic acid molecules coding for the same amino acid
sequences as do the specific ORFs disclosed herein. In other words, in the coding region of an
ORF, substitution of one codon for another codon that encodes the same amino acid is expressly
contemplated.
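
Codon variability of the kind described above can be checked computationally by translating two ORFs and comparing the results. The Python sketch below is a hedged illustration: it assumes the standard genetic code and DNA sequences that start in frame, and the names CODON_TABLE, translate, and encode_same_protein are hypothetical helpers, not part of the disclosure.

```python
from itertools import product

# Standard genetic code laid out in TCAG order; "*" marks a stop codon.
# Assumption: the standard code applies (other translation tables exist).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(orf):
    """Translate an in-frame DNA ORF into a one-letter amino acid string."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def encode_same_protein(orf_a, orf_b):
    """True if two ORFs differ only by synonymous codon substitutions (or not at all)."""
    return translate(orf_a) == translate(orf_b)
```

Here ATGGCTCTG and ATGGCCTTA both translate to Met-Ala-Leu, so the second would count as a codon variant of the first.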

The nearest neighbor or homology result for the nucleic acids of the present invention,
including SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, can be obtained by searching a
database using an algorithm or a program. Preferably, BLAST, which stands for Basic Local
Alignment Search Tool, is used to search for local sequence alignments (Altschul, S.F., J. Mol. Evol.
36:290-300 (1993) and Altschul, S.F. et al., J. Mol. Biol. 215:403-410 (1990)). Alternatively, a
FASTA version 3 search against Genpept, using the Fastxy algorithm, can be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also 
provided by the present invention. Species homologs may be isolated and identified by making 
suitable probes or primers from the sequences provided herein and screening a suitable nucleic 
acid source from the desired species.

The invention also encompasses allelic variants of the disclosed polynucleotides or
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also
encode proteins which are identical, homologous or related to that encoded by the 
polynucleotides. 

The nucleic acid sequences of the invention are further directed to sequences which 
encode variants of the described nucleic acids. These amino acid sequence variants may be 
prepared by methods known in the art by introducing appropriate nucleotide changes into a 
native or variant polynucleotide. There are two variables in the construction of amino acid 
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids 
encoding the amino acid sequence variants are preferably constructed by mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic
acid alterations can be made at sites that differ in the nucleic acids from different species
(variable positions) or in highly conserved regions (constant regions). Sites at such locations
will typically be modified in series, e.g., by substituting first with conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions
may be made at the target site. Amino acid sequence deletions generally range from about 1 to
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid
residues. Intrasequence insertions may range generally from about 1 to 10 amino residues,
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal
sequences necessary for secretion or for intracellular targeting in different host cells and
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein.

In a preferred method, polynucleotides encoding the novel amino acid sequences are
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the
site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as Edelman et al.,
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids.
When small amounts of template DNA are used as starting material, primer(s) that differ
slightly in sequence from the corresponding region in the template DNA can generate the desired
amino acid variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide at the position specified by the
primer. The product DNA fragments replace the corresponding region in the plasmid and this
gives a polynucleotide encoding the desired amino acid variant.
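
The oligonucleotide described above, a changed codon flanked on both sides by unchanged template sequence, can be sketched in Python. This is an illustration under stated assumptions only: the template is taken to be a coding strand that begins in frame at position 0, the default flank of 12 nucleotides is an arbitrary placeholder for "sufficient adjacent nucleotides," and the function name mutagenic_oligo is hypothetical. Duplex stability is not modeled.

```python
def mutagenic_oligo(template, codon_index, new_codon, flank=12):
    """Build a mutagenic oligonucleotide for one codon change.

    template     coding-strand DNA, assumed to start in frame at position 0
    codon_index  zero-based index of the codon to replace
    new_codon    three-nucleotide replacement codon
    flank        unchanged nucleotides kept on each side (assumed value)
    """
    template = template.upper()
    new_codon = new_codon.upper()
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three nucleotides")
    start = 3 * codon_index
    end = start + 3
    # Require the full flank on both sides so the duplex can form around the change.
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("not enough flanking sequence around the target codon")
    return template[start - flank:start] + new_codon + template[end:end + flank]
```

The returned oligonucleotide matches the template everywhere except at the replaced codon; a real design would also weigh melting temperature and secondary structure, which this sketch ignores.
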

A further technique for generating amino acid variants is the cassette mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic
code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be used in the practice of the invention for the cloning and expression
of these novel nucleic acids. Such DNA sequences include those which are capable of
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions.

Polynucleotides encoding preferred polypeptide truncations of the invention can be used 
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more
domains of the invention and heterologous protein sequences.

The polynucleotides of the invention additionally include the complement of any of the 
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or 
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known 
to those of skill in the art and can include, for example, methods for determining hybridization
conditions that can routinely isolate polynucleotides of the desired sequence identities.

In accordance with the invention, polynucleotide sequences comprising the mature 
protein coding sequences corresponding to any one of SEQ ID NO: 1-984, 1969-2952, 3937- 
3942 or 3949-3954, or functional equivalents thereof, may be used to generate recombinant 
DNA molecules that direct the expression of that nucleic acid, or a functional equivalent thereof,
in appropriate host cells. Also included are the cDNA inserts of any of the clones identified
herein. 

A polynucleotide according to the invention can be joined to any of a variety of other
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al.
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g.,
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the
art. Accordingly, the invention also provides a vector including a polynucleotide of the
invention and a host cell containing the polynucleotide. In general, the vector contains an origin
of replication functional in at least one organism, convenient restriction endonuclease sites, and a
selectable marker for the host cell. Vectors according to the invention include expression
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular
organism or part of a multicellular organism.

The present invention further provides recombinant constructs comprising a nucleic acid
having any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-
3954 or a fragment thereof or any other polynucleotides of the invention. In one embodiment, the
recombinant constructs of the present invention comprise a vector, such as a plasmid or viral
vector, into which a nucleic acid having any of the nucleotide sequences of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954 or a fragment thereof is inserted, in a forward or reverse
orientation. In the case of a vector comprising one of the ORFs of the present invention, the
vector may further comprise regulatory sequences, including for example, a promoter, operably
linked to the ORF. Large numbers of suitable vectors and promoters are known to those of skill
in the art and are commercially available for generating the recombinant constructs of the present
invention. The following vectors are provided by way of example. Bacterial: pBs, phagescript,
PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A,
pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44,
PXTI, pSG (Stratagene) pSVK3, pBPV, pMSG, pSVL (Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an expression
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991). Many
suitable expression control sequences are known in the art. General methods of expressing
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated
polynucleotide of the invention and an expression control sequence are situated within a vector
or cell in such a way that the protein is expressed by a host cell which has been transformed
(transfected) with the ligated polynucleotide/expression control sequence.

Promoter regions can be selected from any desired gene using CAT (chloramphenicol
transferase) vectors or other vectors with selectable markers. Two appropriate vectors are
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of
the appropriate vector and promoter is well within the level of ordinary skill in the art.
Generally, recombinant expression vectors will include origins of replication and selectable
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), a-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is
assembled in appropriate phase with translation initiation and termination sequences, and
preferably, a leader sequence capable of directing secretion of translated protein into the
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a structural DNA
sequence encoding a desired protein together with suitable translation initiation and termination
signals in operable reading phase with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of replication to ensure maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be
employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use 
can comprise a selectable marker and bacterial origin of replication derived from commercially
available plasmids comprising genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM 1 (Promega Biotech, Madison, WI, USA). These
pBR322 "backbone" sections are combined with an appropriate promoter and the structural
sequence to be expressed. Following transformation of a suitable host strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced or derepressed by
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an
additional period. Cells are typically harvested by centrifugation, disrupted by physical or
chemical means, and the resulting crude extract retained for further purification.

Polynucleotides of the invention can also be used to induce immune responses. For
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or
following injection, and preferably intramuscular injection of the DNA. The nucleic acid
sequences are preferably inserted in a recombinant expression vector and may be in the form of
naked DNA.

4.3 ANTISENSE

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, or fragments, analogs or
derivatives thereof. An "antisense" nucleic acid comprises a nucleotide sequence that is
complementary to a "sense" nucleic acid encoding a protein, e.g., complementary to the coding
strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. In
specific aspects, antisense nucleic acid molecules are provided that comprise a sequence
complementary to at least about 10, 25, 50, 100, 250 or 500 nucleotides or an entire coding
strand, or to only a portion thereof. Nucleic acid molecules encoding fragments, homologs,
derivatives and analogs of a protein of any of SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or
3955-3960 or antisense nucleic acids complementary to a nucleic acid sequence of SEQ ID NO:
1-984, 1969-2952, 3937-3942 or 3949-3954 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region"
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers
to the region of the nucleotide sequence comprising codons which are translated into amino acid
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954), antisense nucleic acids of the invention can be
designed according to the rules of Watson and Crick or Hoogsteen base pairing. The antisense
nucleic acid molecule can be complementary to the entire coding region of a mRNA, but more
preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding
region of a mRNA. For example, the antisense oligonucleotide can be complementary to the
region surrounding the translation start site of a mRNA. An antisense oligonucleotide can be, for
example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic
acid of the invention can be constructed using chemical synthesis or enzymatic ligation reactions
using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense
oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or
variously modified nucleotides designed to increase the biological stability of the molecules or to
increase the physical stability of the duplex formed between the antisense and sense nucleic
acids, e.g, phosphorotiiioate derivatives and acridine substitiited nucleotides can be used. 
Examples of modified nucleotides that can be used to generate the antisense nucleic acid 
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, 
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, 
beta-D-mannosylqueosine, 5-methoxycarboxymethyluracil, 5-methoxyuracil, 
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, 
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, 
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the 
antisense nucleic acid can be produced biologically using an expression vector into which a 
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the 
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, 
described further in the following subsection). 
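
As a minimal sketch of antisense design by Watson and Crick base pairing, the following Python fragment derives an antisense DNA oligonucleotide complementary to the region surrounding the translation start site. The function names, the 15-nucleotide length, the 5-nucleotide upstream margin and the example sequence are all illustrative assumptions; the sketch does not model Hoogsteen pairing or modified nucleotides.

```python
# Watson-Crick complements for unmodified DNA bases.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def antisense_oligo(coding_strand, length=20, upstream=5):
    """Return (target, oligo) for the region around the first ATG.

    `length` and `upstream` are hypothetical design parameters: the target
    window begins `upstream` bases before the start codon (or at position 0
    if the sequence is shorter) and spans up to `length` bases.
    """
    seq = coding_strand.upper()
    start = seq.find("ATG")
    if start == -1:
        raise ValueError("no start codon found")
    begin = max(0, start - upstream)
    target = seq[begin:begin + length]
    return target, reverse_complement(target)


# Made-up coding-strand sequence, not taken from the sequence listing.
target, oligo = antisense_oligo("GGCACCATGGCTTGGAAATAAGCTTTT", length=15)
print("target (sense)   :", target)
print("oligo (antisense):", oligo)
```

Because the oligonucleotide is the reverse complement of the sense target, taking its reverse complement again recovers the target window, which is a convenient check on the design.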

The antisense nucleic acid molecules of the invention are typically administered to a 
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or 
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the 
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by 
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of 
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in 
the major groove of the double helix. An example of a route of administration of antisense 
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively, 
antisense nucleic acid molecules can be modified to target selected cells and then administered 
systemically. For example, for systemic administration, antisense molecules can be modified 
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., 
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface 
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using 
the vectors described herein. To achieve sufficient intracellular concentrations of antisense 
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the 
control of a strong pol II or pol III promoter are preferred. 

In yet another embodiment, the antisense nucleic acid molecule of the invention is an 
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific 
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the 
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The 
antisense nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et al. 
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) 
FEBS Lett 215: 327-330). 



4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. 
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a 
single-stranded nucleic acid, such as a mRNA, to which they have a complementary region. 
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haselhoff and Gerlach (1988) 
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit 
translation of a mRNA. A ribozyme having specificity for a nucleic acid of the invention can be 
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO: 
1-984, 1969-2952, 3937-3942 or 3949-3954). For example, a derivative of a Tetrahymena L-19 
IVS RNA can be constructed in which the nucleotide sequence of the active site is 
complementary to the nucleotide sequence to be cleaved in a SECX-encoding mRNA. See, e.g., 
Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, 
SECX mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from 
a pool of RNA molecules. See, e.g., Bartel et al., (1993) Science 261:1411-1418. 

Alternatively, gene expression can be inhibited by targeting nucleotide sequences 
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical 
structures that prevent transcription of the gene in target cells. See generally, Helene (1991) 
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and 
Maher (1992) Bioassays 14: 807-15. 

In various embodiments, the nucleic acids of the invention can be modified at the base 
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or 
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic 
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med 
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid 
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a 
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral 
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under 
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using 
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above; 
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675. 

PNAs of the invention can be used in therapeutic and diagnostic applications. For 
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of 
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication. 
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a 
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in 
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or 
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996), 
above). 

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their 
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the 
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug 
delivery known in the art. For example, PNA-DNA chimeras can be generated that may 
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition 
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA 
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked 
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between 
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras 
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24: 
3357-63. For example, a DNA chain can be synthesized on a solid support using standard 
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g., 
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA 
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then 
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3' 
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized 
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem 
Lett 5: 1119-11124. 

In other embodiments, the oligonucleotide may include other appended groups such as 
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the 
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556; 
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or 
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, 
oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et 
al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res. 
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a 
peptide, a hybridization-triggered cross-linking agent, a transport agent, a hybridization-triggered 
cleavage agent, etc. 



4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the 
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the 
invention introduced into the host cell using known transformation, transfection or infection 
methods. The present invention still further provides host cells genetically engineered to express 
the polynucleotides of the invention, wherein such polynucleotides are in operative association 
with a regulatory sequence heterologous to the host cell which drives expression of the 
polynucleotides in the cell. 

Knowledge of nucleic acid sequences allows for modification of cells to permit, or 
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous 
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the 
naturally occurring promoter with all or part of a heterologous promoter so that the cells express 
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it 
is operatively linked to the encoding sequences. See, for example, PCT International Publication 
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International 
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter 
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which 
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or 
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding 
sequence, amplification of the marker DNA by standard selection methods results in 
co-amplification of the desired protein coding sequences in the cells. 

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower 
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a 
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by 
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis, 
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the 
polynucleotides of the invention can be used in conventional manners to produce the gene 
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a 
heterologous protein under the control of the EMF. 

Any host/vector system can be used to express one or more of the ORFs of the present 
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells, 
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis. 
The most preferred cells are those which do not normally express the particular polypeptide or 
protein or which express the polypeptide or protein at a low natural level. Mature proteins can 
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate 
promoters. Cell-free translation systems can also be employed to produce such proteins using 
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and 
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et 
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New 
York (1989), the disclosure of which is hereby incorporated by reference. 

Various mammalian cell culture systems can also be employed to express recombinant 
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney 
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a 
compatible vector are, for example, the C127, monkey COS cells, Chinese Hamster Ovary 
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3 
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived 
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK, 
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of 
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation 
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking 
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example, 
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide 
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced 
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or 
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein 
refolding steps can be used, as necessary, in completing configuration of the mature protein. 
Finally, high performance liquid chromatography (HPLC) can be employed for final purification 
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient 
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing 
agents. 

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast 
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include 
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or 
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial 
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial 
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it 
may be necessary to modify the protein produced therein, for example by phosphorylation or 
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent 
attachments may be accomplished using known chemical or enzymatic methods. 

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of 
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene 
may be replaced by homologous recombination. As described herein, gene targeting can be used 
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different 
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such 
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions, 
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or 
combinations of said sequences. Alternatively, sequences which affect the structure or stability 
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by 
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice 
sites, leader sequences for enhancing or modifying transport or secretion properties of the 
protein, or other sequences which alter or improve the function or stability of protein or RNA 
molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the 
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or 
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion 
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element. 
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific 
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than 
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new 
sequences are added. In all cases, the identification of the targeting event may be facilitated by 
the use of one or more selectable marker genes that are contiguous with the targeting DNA, 
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell 
genome. The identification of the targeting event may also be facilitated by the use of one or 
more marker genes exhibiting the property of negative selection, such that the negatively 
selectable marker is linked to the exogenous DNA, but configured such that the negatively 
selectable marker flanks the targeting sequence, and such that a correct homologous 
recombination event with sequences in the host cell genome does not result in the stable 
integration of the negatively selectable marker. Markers useful for this purpose include the 
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine 
phosphoribosyl-transferase (gpt) gene. 
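
The combined use of a selectable marker and a negatively selectable marker described above can be summarized, purely as an illustrative sketch, by the following Python model. The class and function names are hypothetical, and the model assumes for simplicity that a non-homologous (random) integration retains the flanking negatively selectable marker while a correct homologous recombination event does not.

```python
from dataclasses import dataclass


@dataclass
class IntegrationEvent:
    """Hypothetical summary of what a cell carries after transfection."""
    has_selectable_marker: bool           # marker contiguous with the targeting DNA
    has_negative_marker: bool             # e.g., a TK or gpt gene flanking the targeting sequence


def survives_double_selection(event):
    """A cell survives if it carries the selectable marker (integration
    occurred) and lacks the negatively selectable marker (the flanking
    marker was not stably integrated)."""
    return event.has_selectable_marker and not event.has_negative_marker


# Assumed outcomes for three kinds of cells; these are modeling assumptions.
events = {
    "no integration": IntegrationEvent(False, False),
    "random integration": IntegrationEvent(True, True),
    "homologous recombination": IntegrationEvent(True, False),
}

survivors = [name for name, event in events.items()
             if survives_double_selection(event)]
print(survivors)
```

Under these assumptions only the correctly targeted cells pass both selections, which is the reason the negatively selectable marker is placed so that it flanks the targeting sequence.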

The gene targeting or gene activation techniques which can be used in accordance with 
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to 
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. 
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No. 
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference 
herein in its entirety. 

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide 
comprising: the amino acid sequences set forth as any one of SEQ ID NO: 985-1968, 2953-3936, 
3943-3948 or 3955-3960 or an amino acid sequence encoded by any one of the nucleotide 
sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or the corresponding full 
length or mature protein. Polypeptides of the invention also include polypeptides preferably with 
biological or immunological activity that are encoded by: (a) a polynucleotide having any one of 
the nucleotide sequences set forth in SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or 
(b) polynucleotides encoding any one of the amino acid sequences set forth as SEQ ID NO: 
985-1968, 2953-3936, 3943-3948 or 3955-3960 or (c) polynucleotides that hybridize to the 
complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions. 
The invention also provides biologically active or immunologically active variants of any of the 
amino acid sequences set forth as SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960 
or the corresponding full length or mature protein; and "substantial equivalents" thereof (e.g., at 
least about 65%, at least about 70%, at least about 75%, at least about 80%, 81%, 82%, 83%, 
84%, more typically at least about 85%, 86%, 87%, 88%, 89%, and more typically at least about 
90%, 91%, 92%, 93%, 94%, and even more typically at least about 95%, 96%, 97%, 98%, 99% 
sequence identity) that retain biological activity. Polypeptides encoded by allelic variants may 
have a similar, increased, or decreased activity compared to polypeptides comprising SEQ ID 
NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960. 
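
The percent sequence identity thresholds recited above can be illustrated with a short Python sketch. It assumes the two sequences have already been aligned to equal length with "-" marking gaps, and it counts identical residues over all aligned columns; that denominator is one common convention chosen here for illustration and is not a definition taken from this specification. The function name and example peptides are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Gaps are written as '-'. Columns that are gaps in both sequences are
    ignored; a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Made-up peptides differing at one of ten positions.
identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
print(identity)                       # compare against a chosen threshold
print(identity >= 85.0)
```

Whether a variant meeting a given identity threshold also retains biological activity is an experimental question that the calculation does not address.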

Fragments of the proteins of the present invention which are capable of exhibiting 
biological activity are also encompassed by the present invention. Fragments of the protein may 
be in linear form or they may be cyclized using known methods, for example, as described in H. 
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer. 
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such 
fragments may be fused to carrier molecules such as immunoglobulins for many purposes, 
including increasing the valency of protein binding sites. 

The present invention also provides both full-length and mature forms (for example, 
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding 
sequence is identified in the sequence listing by translation of the disclosed nucleotide 
sequences. The mature form of such protein may be obtained by expression of a full-length 
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form 
of the protein is also determinable from the amino acid sequence of the full-length form. Where 
proteins of the present invention are membrane bound, soluble forms of the proteins are also 
provided. In such forms, part or all of the regions causing the proteins to be membrane bound 
are deleted so that the proteins are fully secreted from the cell in which they are expressed. 

Protein compositions of the present invention may further comprise an acceptable carrier, 
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier. 

The present invention further provides isolated polypeptides encoded by the nucleic acid 
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the 
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a 
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to 
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic 
acid fragments of the present invention are the ORFs that encode proteins. 
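
The notion of a "degenerate variant" can be made concrete with a short Python sketch that translates two ORFs with the standard genetic code and checks that they differ in nucleotide sequence yet encode the same polypeptide. The function names and example ORFs are hypothetical, and the sketch assumes the standard code applies to the organism in question.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(orf):
    """Translate an ORF (length divisible by 3); '*' marks a stop codon."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def is_degenerate_variant(orf_a, orf_b):
    """True if the ORFs differ in sequence but encode the same polypeptide."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)


# Made-up ORFs, not taken from the sequence listing.
reference = "ATGGCTTGGAAATAA"   # M A W K stop
variant = "ATGGCCTGGAAGTGA"     # same residues, different codons
print(translate(reference), translate(variant))
print(is_degenerate_variant(reference, variant))
```

A substitution that changes the encoded residue, such as GCT to GAT in the second codon, yields a different polypeptide and is therefore not a degenerate variant in the sense defined above.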

A variety of methodologies known in the art can be utilized to obtain any one of the 
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid 
sequence can be synthesized using commercially available peptide synthesizers. The 
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary 
structural and/or conformational characteristics with proteins, may possess biological properties 
in common therewith, including protein activity. This technique is particularly useful in 
producing small peptides and fragments of larger polypeptides. Fragments are useful, for 
example, in generating antibodies against the native polypeptide. Thus, they may be employed 
as biologically active or immunological substitutes for natural, purified proteins in screening of 
therapeutic compounds and in immunological processes for the development of antibodies. 

The polypeptides and proteins of the present invention can alternatively be purified from 
cells which have been altered to express the desired polypeptide or protein. As used herein, a 
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic 
manipulation, is made to produce a polypeptide or protein which it normally does not produce or 
which the cell normally produces at a lower level. One skilled in the art can readily adapt 
procedures for introducing and expressing either recombinant or synthetic sequences into 
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides 
or proteins of the present invention. 

The invention also relates to methods for producing a polypeptide comprising growing a 
culture of host cells of the invention in a suitable culture medium, and purifying the protein from 
the cells or the culture in which the cells are grown. For example, the methods of the invention 
include a process for producing a polypeptide in which a host cell containing a suitable 
expression vector that includes a polynucleotide of the invention is cultured under conditions that 
allow expression of the encoded polypeptide. The polypeptide can be recovered from the 
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and 
further purified. Preferred embodiments include those in which the protein produced by such 
process is a full length or mature form of the protein. 

In an alternative method, the polypeptide or protein is purified from bacterial cells which 
naturally produce the polypeptide or protein. One skilled in the art can readily follow known 
methods for isolating polypeptides and proteins in order to obtain one of the isolated 
polypeptides or proteins of the present invention. These include, but are not limited to, 
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography, 
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and 
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory 
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that 
retain biological/immunological activity include fragments comprising greater than about 100 
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein 
domains. 

The purified polypeptides can be used in in vitro binding assays which are well known in 
the art to identify molecules which bind to the polypeptides. These molecules include, but are not 
limited to, e.g., small molecules, molecules from combinatorial libraries, antibodies or other 
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist 
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the 
molecules are titrated into a plurality of cell cultures or animals and then tested for either 
cell/animal death or prolonged survival of the animal/cells. 

In addition, the peptides of the invention or molecules capable of binding to the peptides 
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to 
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the 
specificity of the binding molecule for SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 
3955-3960. 

The protein of the invention may also be expressed as a product of transgenic animals, 
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized 
by somatic or germ cells containing a nucleotide sequence encoding the protein. 

The proteins provided herein also include proteins characterized by amino acid sequences 
similar to those of purified proteins but into which modifications are naturally provided or 
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be 
made by those skilled in the art using known techniques. Modifications of interest in the protein 
sequences may include the alteration, substitution, replacement, insertion or deletion of a 
selected amino acid residue in the coding sequence. For example, one or more of the cysteine 
residues may be deleted or replaced with another amino acid to alter the conformation of the 
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are 
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such 
alteration, substitution, replacement, insertion or deletion retains the desired activity of the 
protein. Regions of the protein that are important for the protein function can be determined by 
various methods known in the art including the alanine-scanning method, which involves 
systematic substitution of single or strings of amino acids with alanine, followed by testing the 
resulting alanine-containing variant for biological activity. This type of analysis determines the 
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are 
important for protein function may be determined by the eMATRIX program. 
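
The systematic substitution step of the alanine-scanning method can be sketched in Python as follows. The function name, the window parameter (1 for single residues, larger values for strings of residues) and the example peptide are illustrative assumptions; the sketch only enumerates the variant sequences, and the subsequent testing of each variant for biological activity remains an experimental step.

```python
def alanine_scan(protein, window=1):
    """Enumerate alanine-substituted variants of a protein sequence.

    Returns a list of (position, variant) pairs, where `position` is the
    1-based index of the first substituted residue. Windows that already
    consist solely of alanine are skipped because substitution would not
    change the sequence.
    """
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the protein length")
    variants = []
    for start in range(len(protein) - window + 1):
        segment = protein[start:start + window]
        if set(segment) == {"A"}:
            continue
        variant = protein[:start] + "A" * window + protein[start + window:]
        variants.append((start + 1, variant))
    return variants


# Made-up four-residue peptide used only to show the enumeration.
single = alanine_scan("MKAC")
double = alanine_scan("MKAC", window=2)
for position, variant in single:
    print(position, variant)
```

Comparing the measured activity of each variant with that of the unmodified protein then indicates which substituted positions are important for function.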

Other fragments and derivatives of the sequences of proteins which would be expected to 
retain protein activity in whole or in part and are useful for screening or other immunological 
methodologies may also be easily made by those skilled in the art given the disclosures herein. 
Such modifications are encompassed by the present invention. 

The protein may also be produced by operably linking the isolated polynucleotide of the 
invention to suitable control sequences in one or more insect expression vectors, and employing 
an insect expression system. Materials and methods for baculovirus/insect cell expression 
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A. 
(the MaxBat™ kit), and such methods are well known in the art, as described in Summers and 
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by 
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present 
invention is "transformed." 

The protein of the invention may be prepared by culturing transformed host cells under 
culture conditions suitable to express the recombinant protein. The resulting expressed protein 
may then be purified from such culture (i.e., from culture medium or cell extracts) using known 
purification processes, such as gel filtration and ion exchange chromatography. The purification 
of the protein may also include an affinity column containing agents which will bind to the 
protein; one or more column steps over such affinity resins as concanavalin A-agarose, 
heparin-toyopearl™ or Cibacrom blue 3GA Sepharose™; one or more steps involving 
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl 
ether; or immunoaffinity chromatography. 

Alternatively, the protein of the invention may also be expressed in a form which will 
facilitate purification. For example, it may be expressed as a fusion protein, such as those of 
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as a 
His tag. Kits for expression and purification of such fusion proteins are commercially available 
from New England BioLab (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen, 
respectively. The protein can also be tagged with an epitope and subsequently purified by using 
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially 
available from Kodak (New Haven, Conn.). 

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC)
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing
purification steps, in various combinations, can also be employed to provide a substantially
homogeneous isolated recombinant protein. The protein thus purified is substantially free of
other mammalian proteins and is defined in accordance with the present invention as an "isolated
protein."

The polypeptides of the invention include analogs (variants). This embraces fragments,
as well as peptides in which one or more amino acids has been deleted, inserted, or substituted.
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to
another moiety or moieties, e.g., targeting moiety or another therapeutic agent. Such analogs
may exhibit improved properties such as activity and/or stability. Examples of moieties which
may be fused to the polypeptide or an analog include, for example, targeting moieties which
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells,
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well
as receptor and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for example,
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as
alpha or beta interferon.

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY
AND SIMILARITY

Preferred methods to determine identity and/or similarity are designed to give the largest
match between the sequences tested. Methods to determine identity and similarity are codified in
computer programs including, but not limited to, the GCG program package, including GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F.
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic Acids Res.
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam
software (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein
incorporated by reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol.
Biol., 157, pp. 105-31 (1982), incorporated herein by reference). The BLAST programs are
publicly available from the National Center for Biotechnology Information (NCBI) and other
sources (BLAST Manual, Altschul, S., et al., NCBI NLM NIH Bethesda, MD 20894; Altschul,
S., et al., J. Mol. Biol. 215:403-410 (1990)).
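
As an illustration only of how a percent-identity figure can be derived from an alignment that
gives the largest match between two sequences, the following minimal Python sketch performs a
Needleman-Wunsch style global alignment and counts identical aligned positions. The function
names and the scoring values (match, mismatch and linear gap penalty) are assumptions chosen for
the example; they are not the parameters of GAP, BLAST or any other program named above, and
the choice of denominator (aligned length including gaps) is one convention among several.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b with a linear gap penalty (illustrative scores)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Percent of alignment columns holding the same residue in both sequences."""
    aligned_a, aligned_b = global_align(a, b)
    if not aligned_a:
        return 0.0
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)
```

For example, under these assumed scores `percent_identity("ACGT", "AGGT")` evaluates to 75.0,
since three of the four aligned positions are identical.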

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the invention can
correspond to all or a portion of a protein according to the invention. In one embodiment, a
fusion protein comprises at least one biologically active portion of a protein according to the
invention. In another embodiment, a fusion protein comprises at least two biologically active
portions of a protein according to the invention. Within the fusion protein, the term "operatively
linked" is intended to indicate that the polypeptide according to the invention and the other
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or
C-terminus.
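
By way of a hedged example of what "fused in-frame" means at the sequence level, the Python
sketch below joins two coding sequences only when both are whole numbers of codons and the
upstream sequence contains no internal stop codon; a terminal stop codon on the upstream
sequence is dropped so that translation can read through into the downstream partner. The
function name `fuse_in_frame` and the example sequences are hypothetical and are not taken from
the present disclosure.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(cds):
    """Split a coding sequence whose length is a multiple of three into codons."""
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def fuse_in_frame(upstream_cds, downstream_cds):
    """Return upstream + downstream as one open reading frame, or raise ValueError."""
    up, down = upstream_cds.upper(), downstream_cds.upper()
    if len(up) % 3 != 0 or len(down) % 3 != 0:
        raise ValueError("each coding sequence must be a whole number of codons")
    up_codons = split_codons(up)
    # A stop codon at the very end of the upstream partner would terminate
    # translation before the downstream partner, so it is removed.
    if up_codons and up_codons[-1] in STOP_CODONS:
        up_codons = up_codons[:-1]
    if any(codon in STOP_CODONS for codon in up_codons):
        raise ValueError("upstream sequence contains an internal stop codon")
    return "".join(up_codons) + down
```

With the toy inputs `"ATGGCCTAA"` and `"ATGAAATGA"`, the function returns `"ATGGCCATGAAATGA"`,
a single reading frame of five codons ending in one stop codon.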

For example, in one embodiment a fusion protein comprises a polypeptide according to
the invention operably linked to the extracellular domain of a second protein.
In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide
sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione
S-transferase) sequences.

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which
the polypeptide sequences according to the invention comprise one or more domains fused to
sequences derived from a member of the immunoglobulin protein family. The immunoglobulin
fusion proteins of the invention can be incorporated into pharmaceutical compositions and
administered to a subject to inhibit an interaction between a ligand and a protein of the invention
on the surface of a cell, to thereby suppress signal transduction in vivo. The immunoglobulin
fusion proteins can be used to affect the bioavailability of a cognate ligand. Inhibition of the
ligand/protein interaction may be useful therapeutically for both the treatment of proliferative
and differentiative disorders, e.g., cancer, as well as modulating (e.g., promoting or inhibiting)
cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be used as
immunogens to produce antibodies in a subject, to purify ligands, and in screening assays to
identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand.

A chimeric or fusion protein of the invention can be produced by standard recombinant
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences
are ligated together in-frame in accordance with conventional techniques, e.g., by employing
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can
be synthesized by conventional techniques including automated DNA synthesizers.
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that
give rise to complementary overhangs between two consecutive gene fragments that can
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for
example, Ausubel et al. (eds.) CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley &
Sons, 1992). Moreover, many expression vectors are commercially available that already encode
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the
invention can be cloned into such an expression vector such that the fusion moiety is linked
in-frame to the protein of the invention.
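
The anchor-primer approach described above can be sketched, under stated assumptions, as
follows. The Python example derives the two junction primers for joining fragment A to
fragment B: each primer anneals to its own fragment and carries a 5' tail copied from the
neighboring fragment, so that the two amplification products share a complementary overlap.
The function names and the default annealing and overlap lengths are hypothetical choices for
illustration; practical primer design would also consider melting temperature and secondary
structure, which this sketch ignores.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def junction_primers(frag_a, frag_b, anneal=18, overlap=12):
    """Return (reverse primer for A, forward primer for B), both written 5' to 3'."""
    a, b = frag_a.upper(), frag_b.upper()
    needed = max(anneal, overlap)
    if len(a) < needed or len(b) < needed:
        raise ValueError("fragments are shorter than the requested primer regions")
    # Reverse primer for A: its 3' part anneals to the end of A, and its 5' tail
    # is complementary to the first `overlap` bases of B.
    a_reverse = reverse_complement(a[-anneal:] + b[:overlap])
    # Forward primer for B: its 5' tail repeats the last `overlap` bases of A,
    # and its 3' part anneals to the start of B.
    b_forward = a[-overlap:] + b[:anneal]
    return a_reverse, b_forward
```

After separate amplification with these primers, the product of fragment A ends with the first
bases of B and the product of fragment B begins with the last bases of A, which is the
complementary overhang that permits annealing and reamplification into a chimeric sequence.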

4.8 GENE THERAPY 

Mutations in the polynucleotides of the invention may result in loss of normal
function of the encoded protein. The invention thus provides gene therapy to restore normal
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of
the invention. Delivery of a functional gene encoding polypeptides of the invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example,
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of
the nucleotides of the present invention or a gene encoding the polypeptides of the present
invention can also be accomplished with extrachromosomal substrates (transient expression) or
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.
Alternatively, it is contemplated that in other human disease states, preventing the expression of
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively
regulate the expression of polypeptides of the invention.

Other methods inhibiting expression of a protein include the introduction of antisense
molecules to the nucleic acids of the present invention, their complements, or their translated RNA
sequences, by methods known in the art. Further, the polypeptides of the present invention can be
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such
as a silencer, which is tissue specific.

The present invention still further provides cells genetically engineered in vivo to express the
polynucleotides of the invention, wherein such polynucleotides are in operative association with a
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of
the present invention.

Knowledge of DNA sequences provided by the invention allows for modification of cells to
permit, increase, or decrease, expression of endogenous polypeptide. Cells can be modified (e.g., by
homologous recombination) to provide increased polypeptide expression by replacing, in whole or
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is
operatively linked to the desired protein encoding sequences. See, for example, PCT International
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired
protein coding sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may
be replaced by homologous recombination. As described herein, gene targeting can be used to
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or
protein produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences
for enhancing or modifying transport or secretion properties of the protein, or other sequences
which alter or improve the function or stability of protein or RNA molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the gene
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the
targeting event may replace an existing element; for example, a tissue-specific enhancer can be
replaced by an enhancer that has broader or different cell-type specificity than the naturally
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are
added. In all cases, the identification of the targeting event may be facilitated by the use of one or
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the
targeting event may also be facilitated by the use of one or more marker genes exhibiting the
property of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and
such that a correct homologous recombination event with sequences in the host cell genome does
not result in the stable integration of the negatively selectable marker. Markers useful for this
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial
xanthine-guanine phosphoribosyl-transferase (gpt) gene.
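
The logic of combining a positively selectable marker with a flanking negatively selectable
marker can be summarized in a short Python sketch. It is a deliberately simplified model: it
assumes that a random (non-homologous) integration retains the flanking negative marker, whereas
a correct homologous recombination event does not, and it treats selection as all-or-nothing.
The class and function names are hypothetical and serve only to illustrate the reasoning of the
preceding paragraph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntegrationEvent:
    name: str
    has_positive_marker: bool   # selectable marker contiguous with the targeting DNA
    has_negative_marker: bool   # flanking marker such as HSV TK or gpt


def survives_double_selection(event):
    """A cell survives only if it carries the positive marker and lacks the negative one."""
    return event.has_positive_marker and not event.has_negative_marker


# Assumed outcomes for the three kinds of cells in a transfected population.
EVENTS = [
    IntegrationEvent("no integration", False, False),
    IntegrationEvent("random integration", True, True),
    IntegrationEvent("homologous recombination", True, False),
]


def surviving_events(events):
    """Names of the events whose cells pass both positive and negative selection."""
    return [event.name for event in events if survives_double_selection(event)]
```

Under these assumptions only the correctly targeted cells pass both selections, which is why the
flanking arrangement enriches for homologous recombinants.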

The gene targeting or gene activation techniques which can be used in accordance with this
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel;
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627
(WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436
(WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety.

4.9 TRANSGENIC ANIMALS

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either overexpressed or
inactivated in the germ line of animals using homologous recombination [Capecchi, Science
244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory
control of exogenous or endogenous promoter elements, are known as transgenic animals.
Animals in which an endogenous gene has been inactivated by homologous recombination are
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic
animals are useful to determine the roles polypeptides of the invention play in biological
processes, and preferably in disease states. Transgenic animals are useful as model systems to
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT
Publication No. WO 94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter the level of expression
of the polypeptides of the invention. Inactivation can be carried out using homologous
recombination methods described above. Activation can be achieved by supplementing or even
replacing the homologous promoter to provide for increased protein expression. The homologous
promoter can be supplemented by insertion of one or more heterologous enhancer elements
known to confer promoter activation in a particular tissue.

The polynucleotides of the present invention also make possible the development, 
through, e.g., homologous recombination or knock out strategies, of animals that fail to express 
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as 
models for studying the in vivo activities of polypeptide as well as for studying modulators of the
polypeptides of the invention. 

4.10 USES AND BIOLOGICAL ACTIVITY

The polynucleotides and proteins of the present invention are expected to exhibit one or
more of the uses or biological activities (including those associated with assays cited herein)
identified herein. Uses or activities described for proteins of the present invention may be
provided by administration or use of such proteins or of polynucleotides encoding such proteins
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The
mechanism underlying the particular condition or pathology will dictate whether the
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic
compositions of the invention" include compositions comprising isolated polynucleotides
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or
polypeptides of the invention (including full length protein, mature protein and truncations or
domains thereof), or compounds and other substances that modulate the overall activity of the
target gene products, either at the level of target gene/protein expression or target protein
activity. Such modulators include polypeptides, analogs (variants), including fragments and
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple
helix formation; and in particular antibodies or other binding partners that specifically recognize
one or more epitopes of the polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular activation 
or in one of the other physiological pathways described herein. 

4.10.1 RESEARCH USES AND UTILITIES

The polynucleotides provided by the present invention can be used by the research
community for various purposes. The polynucleotides can be used to express recombinant
protein for analysis, characterization or therapeutic use; as markers for tissues in which the
corresponding protein is preferentially expressed (either constitutively or at a particular stage of
tissue differentiation or development or in disease states); as molecular weight markers on gels;
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene
positions; to compare with endogenous DNA sequences in patients to identify potential genetic
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known
sequences in the process of discovering other novel polynucleotides; for selecting and making
oligomers for attachment to a "gene chip" or other support, including for examination of
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of
the binding interaction.

The polypeptides provided by the present invention can similarly be used in assays to
determine biological activity, including in a panel of multiple proteins for high-throughput
screening; to raise antibodies or to elicit another immune response; as a reagent (including the
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or
development or in a disease state); and, of course, to isolate correlative receptors or ligands.
Proteins involved in these binding interactions can also be used to screen for peptide or small
molecule inhibitors or agonists of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent grade or
kit format for commercialization as research products. 

Methods for performing the uses listed above are well known to those skilled in the art.
References disclosing such methods include without limitation "Molecular Cloning: A
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.

4.10.2 NUTRITIONAL USES

Polynucleotides and polypeptides of the present invention can also be used as nutritional
sources or supplements. Such uses include without limitation use as a protein or amino acid
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a
particular organism or can be administered as a separate solid or liquid preparation, such as in the
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the
polypeptide or polynucleotide of the invention can be added to the medium in or on which the
microorganism is cultured.

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION
ACTIVITY

A polypeptide of the present invention may exhibit activity relating to cytokine, cell
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting)
activity or may induce production of other cytokines in certain cell populations. A
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many
protein factors discovered to date, including all known cytokines, have exhibited activity in one
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient
confirmation of cytokine activity. The activity of therapeutic compositions of the present
invention is evidenced by any one of a number of routine factor dependent cell proliferation
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3,
MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK,
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following:

Assays for T-cell or thymocyte proliferation include without limitation those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol.
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli,
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation,
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse
and human interferon γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991;
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988;
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse
and human interleukin 6-Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci.
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11-Bennett, F., Giannotti, J.,
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp.
6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin
9-Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology.
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will identify, among others, proteins
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and
cytokine production) include, without limitation, those described in: Current Protocols in
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober,
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7,
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095,
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988.



4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor activity and
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential
state which would be useful for re-engineering damaged or diseased tissues, transplantation,
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce
large quantities of human cells has important working applications for the production of human
proteins which currently must be obtained from non-human sources or donors; implantation of
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases;
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung.

It is contemplated that multiple different exogenous growth factors and/or cytokines may
be administered in combination with the polypeptide of the invention to achieve the desired
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast
growth factor (bFGF).

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of
these cells in culture will facilitate the production of large quantities of mature cells. Techniques
for culturing stem cells are known in the art and administration of polypeptides of the invention,
optionally with other growth factors and/or cytokines, is expected to enhance the survival and
proliferation of the stem cell populations. This can be accomplished by direct administration of
the polypeptide of the invention to the culture medium. Alternatively, stroma cells transfected
with a polynucleotide that encodes for the polypeptide of the invention can be used as a feeder
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926).

Stem cells themselves can be transfected with a polynucleotide of the invention to induce
autocrine expression of the polypeptide of the invention. This will allow for generation of
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be
differentiated into the desired mature cell types. These stable cell lines can also serve as a source
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for
polymerase chain reaction experiments. These studies would allow for the isolation and
identification of differentially expressed genes in stem cell populations that regulate stem cell
proliferation and/or maintenance.

Expansion and maintenance of totipotent stem cell populations will be useful in the
treatment of many pathological conditions. For example, polypeptides of the present invention
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation
of neural cells and for the regeneration of nerve and brain tissue, i.e. for the treatment of central
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition,
the expanded stem cell populations can also be genetically altered for gene therapy purposes and
to decrease host rejection of replacement tissues after grafting or implantation.

Expression of the polypeptide of the invention and its effect on stem cells can also be
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell
types. A broadly applicable method of obtaining pure populations of a specific differentiated
cell type from undifferentiated stem cell populations involves the use of a cell-type specific
promoter driving a selectable marker. The selectable marker allows only cells of the desired type
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus
et al., Differentiation, 48: 173-182 (1991); Klug et al., J. Clin. Invest., 98(1): 216-224 (1998))
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al.,
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be
accomplished by culturing the stem cells in the presence of a differentiation factor such as
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the
effects of endogenous stem cell factor activity and allow differentiation to proceed.

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder
layer, as described by Thompson et al., Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with other growth
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell
proliferation is determined by colony formation on semi-solid support, e.g., as described by
Bernstein et al., Blood, 77: 2316-2321 (1991).



4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of hematopoiesis
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal
biological activity in support of colony forming cells or of factor-dependent cell lines indicates
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility,
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy
to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the
growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or
treat consequent myelo-suppression; in supporting the growth and proliferation of
megakaryocytes and consequently of platelets thereby allowing prevention or treatment of
various platelet disorders such as thrombocytopenia, and generally for use in place of or
complementary to platelet transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as
those usually treated with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment
post irradiation/chemotherapy, either in vivo or ex vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous))
as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following: 

Suitable assays for proliferation and differentiation of various hematopoietic lines are
cited above.

Assays for embryonic stem cell differentiation (which will identify, among others,
proteins that influence embryonic differentiation hematopoiesis) include, without limitation,
those described in: Johansson et al., Cellular Biology 15:141-151, 1995; Keller et al., Molecular
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al.,
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay,
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21,
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al.
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.



4.10.6 TISSUE GROWTH ACTIVITY

A polypeptide of the present invention also may be involved in bone, cartilage, tendon,
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue
repair and replacement, and in healing of burns, incisions and ulcers.

A polypeptide of the present invention which induces cartilage and/or bone growth in
circumstances where bone is not normally formed has application in the healing of bone
fractures and cartilage damage or defects in humans and other animals. Compositions of a
polypeptide, antibody, binding partner, or other modulator of the invention may have
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair
of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is
useful in cosmetic plastic surgery.

A polypeptide of this invention may also be involved in attracting bone-forming cells,
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.)
mediated by inflammatory processes may also be possible using the composition of the
invention.

Another category of tissue regeneration activity that may involve the polypeptide of the
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or
other tissue formation in circumstances where such tissue is not normally formed has application
in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in
humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing
protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as
use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing
defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by
a composition of the present invention contributes to the repair of congenital, trauma induced, or
other tendon or ligament defects of other origin, and is also useful in cosmetic plastic surgery for
attachment or repair of tendons or ligaments. The compositions of the present invention may
provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect
tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis,
carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include
an appropriate matrix and/or sequestering agent as a carrier as is well known in the art.

The compositions of the present invention may also be useful for proliferation of neural
cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a
composition may be used in the treatment of diseases of the peripheral nervous system, such as
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous
system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in
accordance with the present invention include mechanical and traumatic disorders, such as spinal
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies
resulting from chemotherapy or other medical therapies may also be treatable using a
composition of the invention.

Compositions of the invention may also be useful to promote better or faster closure of
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular
insufficiency, surgical and traumatic wounds, and the like.

Compositions of the present invention may also be involved in the generation or
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine,
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the
desired effects may be achieved by inhibition or modulation of fibrotic scarring, which may allow
normal tissue to regenerate. A polypeptide of the present invention may also exhibit angiogenic
activity.

A composition of the present invention may also be useful for gut protection or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and
conditions resulting from systemic cytokine damage.

A composition of the present invention may also be useful for promoting or inhibiting
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the
growth of tissues described above.

Therapeutic compositions of the invention can be used in the following: 

Assays for tissue generation activity include, without limitation, those described in:
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No.
WO91/07491 (skin, endothelium).

Assays for wound healing activity include, without limitation, those described in: Winter,
Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol.
71:382-84 (1978).

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY 

A polypeptide of the present invention may also exhibit immune stimulating or immune
suppressing activity, including without limitation the activities for which assays are described
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A
protein may be useful in the treatment of various immune deficiencies and disorders (including
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and
proliferation of T and/or B lymphocytes, as well as effecting the cytolytic activity of NK cells
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g.,
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses,
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.
Autoimmune disorders which may be treated using a protein of the present invention include
connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
rheumatoid arthritis, autoimmune p[...],
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof,
including antibodies) of the present invention may also be useful in the treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria,
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme,
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune
suppression is desired (including, for example, organ transplantation), may also be treatable
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an
immune response already in progress or may involve preventing the induction of an immune
response. The functions of activated T cells may be inhibited by suppressing T cell responses or
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is
generally an active, non-antigen-specific, process which requires continuous exposure of the T
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence
of the tolerizing agent.

Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells,
followed by an immune reaction that destroys the transplant. The administration of a therapeutic
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells,
and thus acts as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it
may also be necessary to block the function of a combination of B lymphocyte antigens.

The efficacy of particular therapeutic compositions in preventing organ transplant
rejection or GVHD can be assessed using animal models that are predictive of efficacy in
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic
compositions of the invention on the development of that disease.

Blocking antigen function may also be therapeutically useful for treating autoimmune
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are
reactive against self tissue and which promote the production of cytokines and autoantibodies
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T
cell-derived cytokines which may be involved in the disease process. Additionally, blocking
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating
autoimmune disorders can be determined using a number of well-characterized animal models of
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis,
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp.
840-856).

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means
of upregulating immune responses, may also be useful in therapy. Upregulation of immune
responses may be in the form of enhancing an existing immune response or eliciting an initial
immune response. For example, enhancing an immune response may be useful in cases of viral
infection, including systemic viral diseases such as influenza, the common cold, and encephalitis.

Alternatively, anti-viral immune responses may be enhanced in an infected patient by
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with a stimulatory form of
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the
patient. Another method of enhancing anti-viral immune responses would be to isolate infected
cells from a patient, transfect them with a nucleic acid encoding a protein of the present
invention as described herein such that the cells express all or a portion of the protein on their
surface, and reintroduce the transfected cells into the patient. The infected cells would now be
capable of delivering a costimulatory signal to, and thereby activate, T cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation signal to T
cells to induce a T cell mediated immune response against the transfected tumor cells. In
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding
an antisense construct which blocks expression of an MHC class II associated protein, such as
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human
subject may be sufficient to overcome tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured by the
following methods:

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; B[...] 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which
will identify, among others, proteins that modulate T-cell dependent antibody responses and that
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production,
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. Coligan et al. eds. Vol 1
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994.

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins
that generate predominantly Th1 and CTL responses) include, without limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins expressed by
dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery
et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990.

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International
Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and development
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al.,
Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention,
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive
based on the ability of inhibins to decrease fertility in female mammals and decrease
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A
polypeptide of the invention may also be useful for advancement of the onset of fertility in
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic
animals such as, but not limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be measured by
the following methods.

Assays for activin/inhibin activity include, without limitation, those described in: Vale et
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci.
USA 83:3091-3095, 1986.



4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY

A polypeptide of the present invention may be involved in chemotactic or chemokinetic
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils,
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic
receptor activation can be used to mobilize or attract a desired cell population to a desired site of
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or
modulators of the invention) provide particular advantages in treatment of wounds and other
trauma to tissues, as well as in treatment of localized infections. For example, attraction of
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved
immune responses against the tumor or infecting agent.

A protein or peptide has chemotactic activity for a particular cell population if it can
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells.
Whether a particular protein has chemotactic activity for a population of cells can be readily
determined by employing such protein or peptide in any known assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells
across a membrane as well as the ability of a protein to induce the adhesion of one cell
population to another cell population. Suitable assays for movement and adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines
6.12.1-6.12.28); Taub et al., J. Clin. Invest. 95:1370-1376, 1995; Lind et al., APMIS 103:140-146,
1995; Muller et al., Eur. J. Immunol. 25:1744-1748; Gruber et al., J. of Immunol. 152:5860-5867,
1994; Johnston et al., J. of Immunol. 153:1762-1768, 1994.

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders (including
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events
in treating wounds resulting from trauma, surgery or other causes. A composition of the
invention may also be useful for dissolving or inhibiting formation of thromboses and for
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of
cardiac and central nervous system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following: 
Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res.
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins
35:467-474, 1988.

4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation or 
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the 
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For
example, the presence or increased expression of a polynucleotide/polypeptide of the invention
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy.
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer
condition. Identification of single nucleotide polymorphisms associated with cancer or a
predisposition to cancer may also be useful for diagnosis or prognosis. 

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation, 
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth) 
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic
compositions of the invention may be effective in adult and pediatric oncology including in solid
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma,
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer,
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle,
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors,
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma,
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma,
hemangiopericytoma and Kaposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be
administered to treat cancer. Therapeutic compositions can be administered in therapeutically
effective dosages alone or in combination with adjuvant cancer therapy such as surgery,
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial
effect, e.g. reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise
improving overall clinical condition, without necessarily eradicating the cancer.

The composition can also be administered in therapeutically effective amounts as a
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine.
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide,
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin (cis-DDP),
Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin,
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (VP-16-213),
Floxuridine, 5-Fluorouracil (5-Fu), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide,
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog),
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna,
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl,
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.

In addition, therapeutic compositions of the invention may be used for prophylactic 
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g. 
exposure to carcinogens) known in the art that predispose an individual to developing cancers. 
Under these circumstances, it may be beneficial to treat these individuals with therapeutically
effective doses of the polypeptide of the invention to reduce the risk of developing cancers.

In vitro models can be used to determine the effective doses of the polypeptide of the
invention as a potential cancer treatment. These in vitro models include proliferation assays of
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21),
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52: 921-30
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in
Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays such as induction
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial
cell migration as described in Ribatta et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al.,
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available,
e.g. from American Type Tissue Culture Collection catalogs.
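
As an illustration of how an effective dose might be read off a proliferation assay of cultured tumor cells, the following minimal sketch computes percent growth inhibition against an untreated control and interpolates the dose giving 50% inhibition. The function names, the log-linear interpolation and all cell counts and doses are hypothetical assumptions chosen for illustration; they are not data or methods taken from this specification.

```python
import math


def percent_inhibition(treated_count, control_count):
    """Percent reduction in cell count relative to the untreated control."""
    return 100.0 * (1.0 - treated_count / control_count)


def estimate_ic50(doses, inhibitions):
    """Interpolate, on a log10 dose scale, the dose giving 50% inhibition.

    doses must be positive and ascending. Returns None when no pair of
    adjacent doses brackets 50% inhibition.
    """
    points = list(zip(doses, inhibitions))
    for (d_lo, i_lo), (d_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_dose = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_dose
    return None


# Hypothetical assay readout (illustrative values only).
control_count = 10000
doses = [0.1, 1.0, 10.0, 100.0]        # assumed units: micrograms per ml
treated_counts = [9000, 7000, 3000, 1000]

inhibitions = [percent_inhibition(t, control_count) for t in treated_counts]
ic50 = estimate_ic50(doses, inhibitions)
print(ic50)
```

A sigmoidal curve fit would normally be preferred over two-point interpolation; the simpler approach is used here only to keep the sketch self-contained.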



4.10.12 RECEPTOR/LIGAND ACTIVITY

A polypeptide of the present invention may also demonstrate activity as receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions
and their ligands (including without limitation, cellular adhesion molecules (such as selectins,
integrins and their ligands) and receptor/ligand pairs involved in antigen presentation, antigen
recognition and development of cellular and humoral immune responses). Receptors and ligands
are also useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. A protein of the present invention (including, without limitation,
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand
interactions.

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods: 

Suitable assays for receptor-ligand activity include without limitation those described in: 
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M.
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28,
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22), Takai et al., Proc.
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988;
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor for a
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel
overlay assays, or other methods known in the art.

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes,
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of
toxins include, but are not limited to, ricin.



4.10.13 DRUG SCREENING 

This invention is particularly useful for screening chemical compounds by using the 
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques.
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a
solid support, borne on a cell surface or located intracellularly. One method of drug screening
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can
be used for standard binding assays. One may measure, for example, the formation of complexes
between polypeptides of the invention or fragments and the agent being tested or examine the
diminution in complex formation between the novel polypeptides and an appropriate cell line,
which are well known in the art.
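
The diminution in complex formation measured in a competitive binding assay can be expressed as a percent displacement of specific binding. The sketch below is one simple way to do that arithmetic; the function names, the 50% hit threshold and every signal value are hypothetical assumptions for illustration and are not prescribed by this specification.

```python
def percent_displacement(signal_with_agent, signal_total, signal_nonspecific):
    """Percent loss of specific binding signal caused by a test agent.

    signal_total is the signal with no competitor present and
    signal_nonspecific is the background signal (both assumed inputs).
    """
    specific_total = signal_total - signal_nonspecific
    if specific_total <= 0:
        raise ValueError("total signal must exceed nonspecific signal")
    specific_with_agent = signal_with_agent - signal_nonspecific
    return 100.0 * (1.0 - specific_with_agent / specific_total)


def rank_hits(readings, signal_total, signal_nonspecific, threshold=50.0):
    """Return (agent, displacement) pairs at or above threshold, strongest first."""
    scored = [
        (agent, percent_displacement(signal, signal_total, signal_nonspecific))
        for agent, signal in readings.items()
    ]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical plate readings (arbitrary signal units).
readings = {"agent_A": 550.0, "agent_B": 950.0, "agent_C": 190.0}
hits = rank_hits(readings, signal_total=1000.0, signal_nonspecific=100.0)
print(hits)
```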

Sources for test compounds that may be screened for ability to bind to or modulate (i.e., 
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and 
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of either random or mimetic peptides, oligonucleotides or organic molecules.

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or compounds 
that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria and
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a
review, see Science 282:63-68 (1998).

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or
organic compounds and can be readily prepared by traditional automated synthesis methods,
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein,
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries.
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin.
Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see
Al-Obeidi et al., Mol Biotechnol, 9(3):205-23 (1998); Hruby et al., Curr Opin Chem Biol,
1(1):114-19 (1997); Dorner et al., Bioorg Med Chem, 4(5):709-15 (1996) (alkylated dipeptides).

Identification of modulators through use of the various libraries described herein permits
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a 
polypeptide of the invention. The molecules identified in the binding assay are then tested for 
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the 
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested
for either cell/animal death or prolonged survival of the animal/cells.

The binding molecules thus identified may be complexed with toxins, e.g., ricin or
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding
molecule complex is then targeted to a tumor or other cell by the specificity of the binding 
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be 
complexed with imaging agents for targeting and imaging purposes. 

4.10.14 ASSAY FOR RECEPTOR ACTIVITY 

The invention also provides methods to detect specific binding of a polypeptide e.g. a
ligand or a receptor. The art provides numerous assays particularly useful for identifying
previously unknown binding partners for receptor polypeptides of the invention. For example,
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used
to identify polynucleotides encoding binding partners. As another example, affinity
chromatography with the appropriate immobilized polypeptide of the invention can be used to
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number
of different libraries used for the identification of compounds, and in particular small molecules,
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention.
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous
ligands, or cocktails of ligands to two cell populations that are genetically identical except for
the expression of the receptor of the invention: one cell population expresses the receptor of the
invention whereas the other does not. The response of the two cell populations to the addition of
ligand(s) is then compared. Alternatively, an expression library can be co-expressed with the
polypeptide of the invention in cells and assayed for an autocrine response to identify potential
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known
in the art can be used to identify binding partner polypeptides, including, (1) organic and
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of random peptides, oligonucleotides or organic molecules.
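
The comparison of two otherwise identical cell populations described above can be sketched as a fold-change screen: a candidate ligand is flagged when the receptor-expressing population responds markedly more strongly than the population lacking the receptor. The function name, the two-fold cutoff and the response values below are hypothetical assumptions used only to make the logic concrete.

```python
def differential_ligands(response_with_receptor, response_without_receptor, min_fold=2.0):
    """Return {ligand: fold change} for ligands whose response depends on the receptor.

    Both arguments map a ligand name to a measured response (any consistent
    unit). Ligands with a non-positive baseline response are skipped because
    a fold change cannot be computed for them.
    """
    candidates = {}
    for ligand, with_receptor in response_with_receptor.items():
        without_receptor = response_without_receptor[ligand]
        if without_receptor <= 0:
            continue
        fold = with_receptor / without_receptor
        if fold >= min_fold:
            candidates[ligand] = fold
    return candidates


# Hypothetical responses (arbitrary units) for three candidate ligands.
expressing = {"ligand_1": 8.0, "ligand_2": 2.1, "ligand_3": 6.0}
non_expressing = {"ligand_1": 2.0, "ligand_2": 2.0, "ligand_3": 3.0}

candidates = differential_ligands(expressing, non_expressing)
print(candidates)
```

In practice replicate measurements and a statistical test would replace the fixed cutoff; the sketch shows only the comparison itself.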

The role of downstream intracellular signaling molecules in the signaling cascade of the 
polypeptide of the invention can be determined. For example, a chimeric protein in which the 
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a 
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then
be assayed for expected modifications, i.e. phosphorylation. Other methods known to those in the
art can also be used to identify signaling molecules involved in receptor activity.

4.10.15 ANTI-INFLAMMATORY ACTIVITY

Compositions of the present invention may also exhibit anti-inflammatory activity. The 
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the 
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example,
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production
of other factors which more directly inhibit or promote an inflammatory response. Compositions
with such activities can be used to treat inflammatory conditions (including chronic or acute
conditions), including without limitation inflammation associated with infection (such as septic
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury,
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or resulting from
over production of cytokines such as TNF or IL-1. Compositions of the invention may also be
useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material.
Compositions of this invention may be utilized to prevent or treat conditions such as, but not
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1,
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary
disease, other autoimmune disease or inflammatory disease, or may be utilized as an
antiproliferative agent, such as for acute or chronic myelogenous leukemia, or in the prevention
of premature labor secondary to intrauterine infections.

4.10.16 LEUKEMIAS

Leukemias and related disorders may be treated or prevented by administration of a 
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the 
invention. Such leukemias and related disorders include but are not limited to acute leukemia, 
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic,
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia).



4.10.17 NERVOUS SYSTEM DISORDERS

Nervous system disorders, involving cell types which can be tested for efficacy of
intervention with compounds that modulate the activity of the polynucleotides and/or
polypeptides of the invention, and which can be treated upon thus observing an indication of
therapeutic utility, include but are not limited to nervous system injuries, and diseases or
disorders which result in either a disconnection of axons, a diminution or degeneration of
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including
human and non-human mammalian patients) according to the invention include but are not
limited to the following lesions of either the central (including spinal cord, brain) or peripheral
nervous systems:

(i) traumatic lesions, including lesions caused by physical injury or associated with
surgery, for example, lesions which sever a portion of the nervous system, or compression
injuries;

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord
infarction or ischemia;

(iii) infectious lesions, in which a portion of the nervous system is destroyed or injured
as a result of infection, for example, by an abscess or associated with infection by human
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease,
tuberculosis, syphilis;

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or
injured as a result of a degenerative process including but not limited to degeneration associated
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral
sclerosis;

(v) lesions associated with nutritional diseases or disorders, in which a portion of the
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease,
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus
callosum), and alcoholic cerebellar degeneration;

(vi) neurological lesions associated with systemic diseases including but not limited to
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or
sarcoidosis;

(vii) lesions caused by toxic substances including alcohol, lead, or particular
neurotoxins; and

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or
injured by a demyelinating disease including but not limited to multiple sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies,
progressive multifocal leukoencephalopathy, and central pontine myelinolysis.

Therapeutics which are useful according to the invention for treatment of a nervous
system disorder may be selected by testing for biological activity in promoting the survival or
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit
any of the following effects may be useful according to the invention:

(i) increased survival time of neurons in culture;

(ii) increased sprouting of neurons in culture or in vivo;

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g.,
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the method set
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al.
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc.,
depending on the molecule to be measured; and motor neuron dysfunction may be measured by
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron
conduction velocity, or functional disability.

In specific embodiments, motor neuron disorders that may be treated according to the 
invention include but are not limited to disorders such as infarction, infection, exposure to toxin,
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as
well as other components of the nervous system, as well as disorders that selectively affect
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome),
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy
(Charcot-Marie-Tooth Disease).

4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents,
including, without limitation, bacteria, viruses, fungi and other parasites; effecting (suppressing
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape
(such as, for example, breast augmentation or diminution, change in bone form or shape);
effecting biorhythms or circadian cycles or rhythms; effecting the fertility of male or female
subjects; effecting the metabolism, catabolism, anabolism, processing, utilization, storage or
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other
nutritional factors or component(s); effecting behavioral characteristics, including, without
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting
deficiencies of the enzyme and treating deficiency-related diseases; treatment of
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen
in a vaccine composition to raise an immune response against such protein or another material or
entity which is cross-reactive with such protein.

4.10.19 IDENTIFICATION OF POLYMORPHISMS

The demonstration of polymorphisms makes possible the identification of such
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or
susceptibility to various disease states (such as disorders involving inflammation or immune
response) or a differential response to drug administration, and this genetic information can be
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a
polymorphism associated with a predisposition to inflammation or autoimmune disease makes
possible the diagnosis of this condition in humans by identifying the presence of the
polymorphism.

Polymorphisms can be identified in a variety of ways known in the art which all
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally
involving isolation or amplification of the DNA, and identifying the presence of the
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides).
In addition, traditional restriction fragment length polymorphism analysis (using restriction
enzymes that provide differential digestion of the genomic DNA depending on the presence or
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the
present invention can be used to detect polymorphisms. The array can comprise modified
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the
present invention. In the alternative, any one of the nucleotide sequences of the present
invention can be placed on the array to detect changes from those sequences.
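
To make the restriction fragment length polymorphism approach concrete, the following minimal sketch digests two short alleles in silico and compares the resulting fragment lengths. The sequences are invented for illustration and do not correspond to any sequence of the invention; the helper names are hypothetical, and the EcoRI recognition site GAATTC (cut after the first G) is used only as a familiar example enzyme.

```python
def snp_positions(reference, variant):
    """Return (index, reference base, variant base) for each mismatch.

    Assumes both sequences are already aligned and of equal length.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length")
    return [
        (i, ref_base, var_base)
        for i, (ref_base, var_base) in enumerate(zip(reference, variant))
        if ref_base != var_base
    ]


def digest(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear sequence at each site.

    cut_offset is the number of bases into the site at which the cut falls
    (1 for EcoRI, which cuts between G and AATTC).
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(len(sequence) - start)
    return fragments


# Invented example alleles: a single A-to-G change destroys the GAATTC site.
reference_allele = "ATGCCGAATTCTTAGGC"
variant_allele = "ATGCCGAGTTCTTAGGC"

differences = snp_positions(reference_allele, variant_allele)
reference_fragments = digest(reference_allele)
variant_fragments = digest(variant_allele)
print(differences, reference_fragments, variant_fragments)
```

The differing fragment patterns (two fragments for the reference allele, one uncut fragment for the variant) are what a gel would resolve in a conventional analysis.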

Alternatively a polymorphism resulting in a change in the amino acid sequence could 
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g., 
by an antibody specific to the variant sequence. 

4.10.20 ARTHRITIS AND INFLAMMATION

The immunosuppressive effects of the compositions of the invention against rheumatoid
arthritis are determined in an experimental animal model system. The experimental model system
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983,
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129.
Induction of the disease can be caused by a single injection, generally intradermally, of a
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant
mixture. The polypeptide is administered in phosphate buffered solution (PBS) at a dose of about
1-5 mg/kg. The control consists of administering PBS only.

The procedure for testing the effects of the test compound would consist of intradermally
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and
24 days after injection of Mycobacterium CFA, an overall arthritis score may be obtained as
described by J. Holoshitz above. An analysis of the data would reveal that the test compound
would have a dramatic effect on the swelling of the joints as measured by a decrease of the
arthritis score.
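
The comparison of arthritis scores between treated and PBS control animals on the scoring days listed above can be tabulated as follows. This is a minimal sketch only: the animal identifiers, the scores and the percent-reduction summary are hypothetical assumptions for illustration, not results reported in this specification.

```python
from statistics import mean

# Scoring days taken from the protocol described above.
SCORING_DAYS = [14, 15, 18, 20, 22, 24]


def mean_scores_by_day(scores_by_animal):
    """Average the arthritis score across animals for each scoring day.

    scores_by_animal maps an animal id to a list of scores aligned with
    SCORING_DAYS.
    """
    return {
        day: mean(scores[i] for scores in scores_by_animal.values())
        for i, day in enumerate(SCORING_DAYS)
    }


def percent_reduction(control_means, treated_means):
    """Percent decrease of the treated mean score relative to control, per day.

    Assumes every control mean is greater than zero.
    """
    return {
        day: 100.0 * (control_means[day] - treated_means[day]) / control_means[day]
        for day in control_means
    }


# Hypothetical scores (illustrative values only).
control_group = {"rat_1": [4, 6, 8, 10, 12, 12], "rat_2": [6, 6, 10, 10, 12, 14]}
treated_group = {"rat_3": [2, 3, 4, 5, 6, 6], "rat_4": [3, 3, 5, 5, 6, 7]}

control_means = mean_scores_by_day(control_group)
treated_means = mean_scores_by_day(treated_group)
reduction = percent_reduction(control_means, treated_means)
print(reduction)
```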



4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or 
other binding partners or modulators including antisense polynucleotides) of the invention have
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications 
include, but are not limited to, those exemplified herein. 

4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of the
polypeptides or other composition of the invention to individuals affected by a disease or
disorder that can be modulated by regulating the peptides of the invention. While the mode of
administration is not particularly important, parenteral administration is preferred. An
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the
polypeptides or other composition of the invention will normally be determined by the
prescribing physician. It is to be expected that the dosage will vary according to the age, weight,
condition and response of the individual patient. Typically, the amount of polypeptide
administered per dose will be in the range of about 0.01 ng/kg to 100 mg/kg of body weight, with
the preferred dose being about 0.1 ng/kg to 10 mg/kg of patient body weight. For parenteral
administration, polypeptides of the invention will be formulated in an injectable form combined
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting
of small amounts of the human serum albumin. The vehicle may contain minor amounts of
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient.
The preparation of such solutions is within the skill of the art.

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF
ADMINISTRATION

A protein or other composition of the present invention (from whatever source derived,
including without limitation from recombinant and non-recombinant sources and including
antibodies and other binding partners of the polypeptides of the invention) may be administered
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents,
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the
effectiveness of the biological activity of the active ingredient(s). The characteristics of the
carrier will depend on the route of administration. The pharmaceutical composition of the
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12,
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell
factor, and erythropoietin. In further compositions, proteins of the invention may be combined
with other agents beneficial to the treatment of the disease or disorder in question. These agents
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor
(IGF), as well as cytokines described herein.

The pharmaceutical composition may further contain other agents which either enhance
the activity of the protein or other active ingredient or complement its activity or use in
treatment. Such additional factors and/or agents may be included in the pharmaceutical
composition to produce a synergistic effect with protein or other active ingredient of the
invention, or to minimize side effects. Conversely, protein or other active ingredient of the
present invention may be included in formulations of the particular clotting factor, cytokine,
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or
anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other
hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or
complexes with itself or other proteins. As a result, pharmaceutical compositions of the
invention may comprise a protein of the invention in such multimeric or complexed form.
As an alternative to being included in a pharmaceutical composition of the invention
including a first protein, a second protein or a therapeutic agent may be concurrently
administered with the first protein (e.g., at the same time, or at differing times provided that
therapeutic concentrations of the combination of agents are achieved at the treatment site).
Techniques for formulation and administration of the compounds of the instant application may
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest
edition. A therapeutically effective dose further refers to that amount of the compound sufficient
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the
relevant medical condition, or an increase in rate of treatment, healing, prevention or
amelioration of such conditions. When applied to an individual active ingredient, administered
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a
combination, a therapeutically effective dose refers to combined amounts of the active
ingredients that result in the therapeutic effect, whether administered in combination, serially or
simultaneously.

In practicing the method of treatment or use of the present invention, a therapeutically
effective amount of protein or other active ingredient of the present invention is administered to
a mammal having a condition to be treated. Protein or other active ingredient of the present
invention may be administered in accordance with the method of the invention either alone or in
combination with other therapies, such as treatments employing cytokines, lymphokines or other
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other
hematopoietic factors, protein or other active ingredient of the present invention may be
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic
factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially,
the attending physician will decide on the appropriate sequence of administering protein or other
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other
hematopoietic factor(s), thrombolytic or anti-thrombotic factors.

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or
intestinal administration; parenteral delivery, including intramuscular, subcutaneous,
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous,
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active
ingredient of the present invention used in the pharmaceutical composition or to practice the
method of the present invention can be carried out in a variety of conventional ways, such as oral
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral
or intravenous injection. Intravenous administration to the patient is preferred.

Alternately, one may administer the compound in a local rather than systemic manner, for
example, via injection of the compound directly into arthritic joints or fibrotic tissue, often in
a depot or sustained release formulation. In order to prevent the scarring process frequently
occurring as a complication of glaucoma surgery, the compounds may be administered topically,
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery
system, for example, in a liposome coated with a specific antibody, targeting, for example,
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the
afflicted tissue.

The polypeptides of the invention are administered by any route that delivers an effective
dosage to the desired site of action. The determination of a suitable route of administration and
an effective dosage for a particular indication is within the level of skill in the art. Preferably for
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage
ranges for the polypeptides of the invention can be extrapolated from these dosages or from
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the
clinician to provide maximal therapeutic benefit.

4.12.2 COMPOSITIONS/FORMULATIONS

Pharmaceutical compositions for use in accordance with the present invention thus may
be formulated in a conventional manner using one or more physiologically acceptable carriers
comprising excipients and auxiliaries which facilitate processing of the active compounds into
preparations which can be used pharmaceutically. These pharmaceutical compositions may be
manufactured in a manner that is itself known, e.g., by means of conventional mixing,
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or
lyophilizing processes. Proper formulation is dependent upon the route of administration chosen.
When a therapeutically effective amount of protein or other active ingredient of the present
invention is administered orally, protein or other active ingredient of the present invention will
be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form,
the pharmaceutical composition of the invention may additionally contain a solid carrier such as
a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or
other active ingredient of the present invention, and preferably from about 25 to 90% protein or
other active ingredient of the present invention. When administered in liquid form, a liquid
carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil,
soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the
pharmaceutical composition may further contain physiological saline solution, dextrose or other
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol.
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to
90% by weight of protein or other active ingredient of the present invention, and preferably from
about 1 to 50% protein or other active ingredient of the present invention.

When a therapeutically effective amount of protein or other active ingredient of the
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or
subcutaneous injection should contain, in addition to protein or other active ingredient of the
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection,
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or
other vehicle as known in the art. The pharmaceutical composition of the present invention may
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of
skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions,
preferably in physiologically compatible buffers such as Hanks's solution, Ringer's solution, or
physiological saline buffer. For transmucosal administration, penetrants appropriate to the
barrier to be permeated are used in the formulation. Such penetrants are generally known in the
art.

For oral administration, the compounds can be formulated readily by combining the
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules,
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be
treated. Pharmaceutical preparations for oral use can be obtained from a solid excipient,
optionally grinding a resulting mixture, and processing the mixture of granules, after adding
suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose
preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin,
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic,
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be
added to the tablets or dragee coatings for identification or to characterize different combinations
of active compound doses.

Pharmaceutical preparations which can be used orally include push-fit capsules made of
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and,
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition,
stabilizers may be added. All formulations for oral administration should be in dosages suitable
for such administration. For buccal administration, the compositions may take the form of
tablets or lozenges formulated in conventional manner.

For administration by inhalation, the compounds for use according to the present
invention are conveniently delivered in the form of an aerosol spray presentation from
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g.,
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in
an inhaler or insufflator may be formulated containing a powder mix of the compound and a
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with
an added preservative. The compositions may take such forms as suspensions, solutions or
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending,
stabilizing and/or dispersing agents.

Pharmaceutical formulations for parenteral administration include aqueous solutions of
the active compounds in water-soluble form. Additionally, suspensions of the active compounds
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which
increase the solubility of the compounds to allow for the preparation of highly concentrated
solutions. Alternatively, the active ingredient may be in powder form for constitution with a
suitable vehicle, e.g., sterile pyrogen-free water, before use.

The compounds may also be formulated in rectal compositions such as suppositories or
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other
glycerides. In addition to the formulations described previously, the compounds may also be
formulated as a depot preparation. Such long acting formulations may be administered by
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection.
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as
sparingly soluble derivatives, for example, as a sparingly soluble salt.

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic
administration. Naturally, the proportions of a co-solvent system may be varied considerably
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other
biocompatible polymers may replace polyethylene glycol, e.g. polyvinyl pyrrolidone; and other
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity.
Additionally, the compounds may be delivered using a sustained-release system, such as
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent.
Various types of sustained-release materials have been established and are well known by those
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the
biological stability of the therapeutic reagent, additional strategies for protein or other active
ingredient stabilization may be employed.
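
The arithmetic behind the VPD:5W dilution described above can be illustrated with a short sketch. It is not part of the disclosed method; it simply halves the stated stock concentrations for a 1:1 dilution, under the assumption that volumes are additive on mixing (only approximately true for ethanol/water mixtures). The function name dilute_vpd and its parameters are hypothetical.

```python
# Stock VPD composition as stated in the text (percent w/v).
VPD_STOCK = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
DEXTROSE_IN_WATER = 5.0  # percent w/v dextrose in the aqueous diluent


def dilute_vpd(parts_vpd: float = 1.0, parts_diluent: float = 1.0) -> dict:
    """Return percent w/v of each component after mixing VPD with 5% dextrose.

    Assumption (not from the text): volumes are additive on mixing.
    """
    if parts_vpd <= 0 or parts_diluent <= 0:
        raise ValueError("both parts must be positive")
    total = parts_vpd + parts_diluent
    # Each VPD component is diluted by the VPD volume fraction.
    final = {name: pct * parts_vpd / total for name, pct in VPD_STOCK.items()}
    # Dextrose comes only from the diluent.
    final["dextrose"] = DEXTROSE_IN_WATER * parts_diluent / total
    return final


if __name__ == "__main__":
    for name, pct in dilute_vpd().items():
        print(f"{name}: {pct:.2f}% w/v")
```

Changing parts_vpd or parts_diluent shows how the proportions of the co-solvent system could be varied, as the text permits.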

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers
or excipients. Examples of such carriers or excipients include but are not limited to calcium
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically
acceptable base addition salts are those salts which retain the biological effectiveness and
properties of the free acids and which are obtained by reaction with inorganic or organic bases
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine,
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and
the like.

The pharmaceutical composition of the invention may be in the form of a complex of the
protein(s) or other active ingredient(s) of present invention along with protein or peptide
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following
presentation of the antigen by MHC proteins. MHC and structurally related proteins including
those encoded by class I and class II MHC genes on host cells will serve to present the peptide
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells.
Alternatively, antibodies able to bind surface immunoglobulin and other molecules on B cells as
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the
pharmaceutical composition of the invention.

The pharmaceutical composition of the invention may be in the form of a liposome in
which protein of the present invention is combined, in addition to other pharmaceutically
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides,
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S.
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated
herein by reference.

The amount of protein or other active ingredient of the present invention in the
pharmaceutical composition of the present invention will depend upon the nature and severity of
the condition being treated, and on the nature of prior treatments which the patient has
undergone. Ultimately, the attending physician will decide the amount of protein or other active
ingredient of the present invention with which to treat each individual patient. Initially, the
attending physician will administer low doses of protein or other active ingredient of the present
invention and observe the patient's response. Larger doses of protein or other active ingredient
of the present invention may be administered until the optimal therapeutic effect is obtained for 
the patient, and at that point the dosage is not increased further. It is contemplated that the 
various pharmaceutical compositions used to practice the method of the present invention should 
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more preferably
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg
body weight. For compositions of the present invention which are useful for bone, cartilage,
tendon or ligament regeneration, the therapeutic method includes administering the composition
topically, systemically, or locally as an implant or device. When administered, the therapeutic
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable
form. Further, the composition may desirably be encapsulated or injected in a viscous form for
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other
active ingredient of the invention which may also optionally be included in the composition as
described above, may alternatively or additionally, be administered simultaneously or
sequentially with the composition in the methods of the invention. Preferably for bone and/or
cartilage formation, the composition would include a matrix capable of delivering the
protein-containing or other active ingredient-containing composition to the site of bone and/or
cartilage damage, providing a structure for the developing bone and cartilage and optimally
capable of being resorbed into the body. Such matrices may be formed of materials presently in
use for other implanted medical applications.

The choice of matrix material is based on biocompatibility, biodegradability, mechanical
properties, cosmetic appearance and interface properties. The particular application of the
compositions will define the appropriate formulation. Potential matrices for the compositions
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate,
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further
matrices are comprised of pure proteins or extracellular matrix components. Other potential
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass,
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above
mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and
tricalcium phosphate. The bioceramics may be altered in composition, such as in
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns.
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from
the matrix.

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose,
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate,
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol).
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on
total formulation weight, which represents the amount necessary to prevent desorption of the
protein from the polymer matrix and to provide appropriate handling of the composition, yet not
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further
compositions, proteins or other active ingredients of the invention may be combined with other
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in
question. These agents include various growth factors such as epidermal growth factor (EGF),
platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and
insulin-like growth factor (IGF).

The therapeutic compositions are also presently valuable for veterinary applications.
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired
patients for such treatment with proteins or other active ingredients of the present invention. The
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue
regeneration will be determined by the attending physician considering various factors which
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g.,
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and
with inclusion of other proteins in the pharmaceutical composition. For example, the addition of
other known growth factors, such as IGF I (insulin like growth factor I), to the final composition,
may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone
growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline
labeling.

Polynucleotides of the present invention can also be used for gene therapy. Such
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a
mammalian subject. Polynucleotides of the invention may also be administered by other known
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.

4.12.3 EFFECTIVE DOSAGE

Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an effective amount to achieve their
intended purpose. More specifically, a therapeutically effective amount means an amount
effective to prevent development of or to alleviate the existing symptoms of the subject being
treated. Determination of the effective amount is well within the capability of those skilled in
the art, especially in light of the detailed disclosure provided herein. For any compound used in
the method of the invention, the therapeutically effective dose can be estimated initially from
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a
circulating concentration range that can be used to more accurately determine useful doses in
humans. For example, a dose can be formulated in animal models to achieve a circulating
concentration range that includes the IC50 as determined in cell culture (i.e., the concentration of
the test compound which achieves a half-maximal inhibition of the protein's biological activity).
Such information can be used to more accurately determine useful doses in humans.

A therapeutically effective dose refers to that amount of the compound that results in
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred.
The data obtained from these cell culture assays and animal studies can be used in formulating a
range of dosage for use in humans. The dosage of such compounds lies preferably within a range
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may
vary within this range depending upon the dosage form employed and the route of administration
utilized. The exact formulation, route of administration and dosage can be chosen by the
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The
Pharmacological Basis of Therapeutics", Ch. 1 p. 1. Dosage amount and interval may be adjusted
individually to provide plasma levels of the active moiety which are sufficient to maintain the
desired effects, or minimal effective concentration (MEC). The MEC will vary for each
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will
depend on individual characteristics and route of administration. However, HPLC assays or
bioassays can be used to determine plasma concentrations.
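
As a minimal illustration of the therapeutic index described above (the ratio between LD50 and ED50), the following Python sketch computes that ratio from two doses. The function name and the numeric values are hypothetical and are not taken from this specification; both doses are assumed to be expressed in the same units.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the ratio LD50 / ED50; both doses must share the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg, chosen only for illustration.
print(therapeutic_index(ld50=500.0, ed50=5.0))  # 100.0
```

A larger ratio corresponds to a wider margin between the effective and the lethal dose, which is why compounds with high therapeutic indices are preferred.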
Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the
time, preferably between 30-90% and most preferably between 50-90%. In cases of local
administration or selective uptake, the effective local concentration of the drug may not be
related to plasma concentration.
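
The time-above-MEC criterion can be checked against measured plasma levels. The sketch below is one simple way to do so, assuming plasma concentrations sampled at equally spaced times over a dosing interval; the function names, the sample values and the MEC are hypothetical and serve only as illustration.

```python
def fraction_above_mec(concentrations, mec):
    """Fraction of equally spaced plasma samples strictly above the MEC."""
    if not concentrations:
        raise ValueError("need at least one sample")
    above = sum(1 for c in concentrations if c > mec)
    return above / len(concentrations)


def regimen_tier(fraction):
    """Map a time-above-MEC fraction onto the ranges stated in the text."""
    if 0.5 <= fraction <= 0.9:
        return "most preferred (50-90%)"
    if 0.3 <= fraction <= 0.9:
        return "preferred (30-90%)"
    if 0.1 <= fraction <= 0.9:
        return "acceptable (10-90%)"
    return "outside the stated range"


# Hypothetical plasma levels (arbitrary units) sampled hourly after one dose.
samples = [8.0, 6.0, 4.0, 3.0, 2.0, 1.5, 1.0, 0.5]
frac = fraction_above_mec(samples, mec=2.0)
print(frac, regimen_tier(frac))  # 0.5 most preferred (50-90%)
```

Counting samples is only an approximation of time above the MEC; a finer sampling grid or a fitted pharmacokinetic curve would give a more precise estimate.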
An exemplary dosage regimen for polypeptides or other compositions of the invention
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred
dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying in adults and
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter
intervals.

The amount of composition administered will, of course, be dependent on the subject
being treated, on the subject's age and weight, the severity of the affliction, the manner of
administration and the judgment of the prescribing physician.

4.12.4 PACKAGING

The compositions may, if desired, be presented in a pack or dispenser device which may
contain one or more unit dosage forms containing the active ingredient. The pack may, for
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may
be accompanied by instructions for administration. Compositions comprising a compound of the
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an
appropriate container, and labeled for treatment of an indicated condition.

4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins of the
invention. The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and F(ab')2
fragments, and an Fab expression library. In general, an antibody molecule obtained from
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well,
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or
a lambda chain. Reference herein to antibodies includes a reference to all such classes,
subclasses and types of human antibody species.

An isolated related protein of the invention may be intended to serve as an antigen, or a
portion or fragment thereof, and additionally can be used as an immunogen to generate
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the
invention provides antigenic peptide fragments of the antigen for use as immunogens. An
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence
of the full length protein, such as an amino acid sequence shown in SEQ ID NO:985, and
encompasses an epitope thereof such that an antibody raised against the peptide forms a specific
immune complex with the full length protein or with any fragment that contains the epitope.
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino
acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred
epitopes encompassed by the antigenic peptide are regions of the protein that are located on its
surface; commonly these are hydrophilic regions.

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of the related protein that is located on the surface of the protein, e.g., a
hydrophilic region. A hydrophobicity analysis of the human related protein sequence will
indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely
to encode surface residues useful for targeting antibody production. As a means for targeting
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity
may be generated by any method well known in the art, including, for example, the Kyte
Doolittle or the Hopp Woods methods, either with or without Fourier transformation. See, e.g.,
Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J.
Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety.
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives,
fragments, analogs or homologs thereof, are also provided herein.
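
The hydropathy analysis mentioned above can be sketched as a sliding-window average over a per-residue scale. The Python example below uses the published Kyte-Doolittle hydropathy values, in which lower scores indicate more hydrophilic residues. The window length of 7, the function names and the test sequence are assumptions made for illustration only; the sequence is not a sequence of the invention, and the Hopp-Woods scale and Fourier transformation are not shown.

```python
# Kyte-Doolittle hydropathy scale (Kyte and Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean Kyte-Doolittle score for every window along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD[aa] for aa in seq]  # KeyError on non-standard residues
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=7):
    """Return (start index, peptide, mean score) of the lowest-scoring window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence.upper()[start:start + window], profile[start]


# Hypothetical sequence used only to exercise the functions.
start, peptide, score = most_hydrophilic_window("IIIIIIIKKKKKKKIIIIIII")
print(start, peptide, round(score, 2))  # 7 KKKKKKK -3.9
```

Windows with the lowest mean scores are candidate surface-exposed regions and therefore candidate antigenic peptides; in practice the window length is a tunable parameter.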

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog
thereof, may be utilized as an immunogen in the generation of antibodies that
immunospecifically bind these protein components.

Various procedures known within the art may be used for the production of polyclonal or
monoclonal antibodies directed against a protein of the invention, or against derivatives,
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below.



5.13.1 Polyclonal Antibodies 
For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit,
goat, mouse or other mammal) may be immunized by one or more injections with the native
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate
immunogenic preparation can contain, for example, the naturally occurring immunogenic
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to
a second protein known to be immunogenic in the mammal being immunized. Examples of such
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin,
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an
adjuvant. Various adjuvants used to increase the immunological response include, but are not
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions,
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A,
synthetic trehalose dicorynomycolate).

The polyclonal antibody molecules directed against the immunogenic protein can be
isolated from the mammal (e.g., from the blood) and further purified by well known techniques,
such as affinity chromatography using protein A or protein G, which provide primarily the IgG
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to
purify the immune specific antibody by immunoaffinity chromatography. Purification of
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28).

5.13.2 Monoclonal Antibodies

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used
herein, refers to a population of antibody molecules that contain only one molecular species of
antibody molecule consisting of a unique light chain gene product and a unique heavy chain
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal
antibody are identical in all the molecules of the population. MAbs thus contain an antigen
binding site capable of immunoreacting with a particular epitope of the antigen characterized by
a unique binding affinity for it.

Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse,
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro.

The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion
protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin
are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are
desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing
agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies:
Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell lines are usually
transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin.
Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can be cultured in
a suitable culture medium that preferably contains one or more substances that inhibit the growth
or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for
the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT
medium"), which substances prevent the growth of HGPRT-deficient cells.

Preferred immortalized cell lines are those that fuse efficiently, support stable high level
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego,
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and
mouse-human heteromyeloma cell lines also have been described for the production of human
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp.
51-63).

The culture medium in which the hybridoma cells are cultured can then be assayed for
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding
specificity of monoclonal antibodies produced by the hybridoma cells is determined by
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or
enzyme-linked immunoabsorbent assay (ELISA). Such techniques and assays are known in the
art. The binding affinity of the monoclonal antibody can, for example, be determined by the
Scatchard analysis of Munson and Pollard, Anal. Biochem., 102:220 (1980). Preferably,
antibodies having a high degree of specificity and a high binding affinity for the target antigen
are isolated.
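
To illustrate how a binding affinity can be read from a Scatchard plot, the sketch below fits a straight line to bound/free versus bound ligand concentration. It assumes a single class of non-interacting binding sites, for which bound/free = Ka x (Bmax - bound), so the slope is -Ka and the x-intercept is Bmax. This is a plain least-squares simplification and not the fitting procedure of Munson and Pollard; the function name and the synthetic data are hypothetical.

```python
def scatchard_fit(bound, free):
    """Estimate (Ka, Bmax) by linear regression of bound/free on bound."""
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need two or more paired bound/free measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("bound values must not all be identical")
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    if slope == 0:
        raise ValueError("zero slope: affinity cannot be estimated")
    intercept = mean_y - slope * mean_x
    ka = -slope              # association constant, 1 / concentration units
    bmax = intercept / ka    # total binding-site concentration
    return ka, bmax


# Synthetic single-site data generated with Ka = 2.0 and Bmax = 10.0.
free = [0.1, 0.5, 1.0, 2.0, 5.0]
bound = [10.0 * 2.0 * f / (1.0 + 2.0 * f) for f in free]
ka, bmax = scatchard_fit(bound, free)
print(round(ka, 6), round(bmax, 6))  # 2.0 10.0
```

A curved rather than linear Scatchard plot would indicate that the single-site assumption does not hold, in which case a nonlinear fit is more appropriate.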

After the desired hybridoma cells are identified, the clones can be subcloned by limiting
dilution procedures and grown by standard methods. Suitable culture media for this purpose
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium.
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal.

The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture
medium or ascites fluid by conventional immunoglobulin purification procedures such as, for
example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or
affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such as
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the
invention can be readily isolated and sequenced using conventional procedures (e.g., by using
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for
example, by substituting the coding sequence for human heavy and light chain constant domains
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368,
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin
polypeptide can be substituted for the constant domains of an antibody of the invention, or can
be substituted for the variable domains of one antigen-combining site of an antibody of the
invention to create a chimeric bivalent antibody.

5.13.2 Humanized Antibodies 

The antibodies directed against the protein antigens of the invention can further comprise
humanized antibodies or human antibodies. These antibodies are suitable for administration to
humans without engendering an immune response by the human against the administered
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins,
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other antigen-binding
subsequences of antibodies) that are principally comprised of the sequence of a human
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin.
Humanization can be performed following the method of Winter and co-workers (Jones et al.,
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al.,
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding
non-human residues. Humanized antibodies can also comprise residues which are found neither
in the recipient antibody nor in the imported CDR or framework sequences. In general, the
humanized antibody will comprise substantially all of at least one, and typically two, variable
domains, in which all or substantially all of the CDR regions correspond to those of a non-human
immunoglobulin and all or substantially all of the framework regions are those of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at
least a portion of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol.,
2:593-596 (1992)).

5.13.3 Human Antibodies

Fully human antibodies relate to antibody molecules in which essentially the entire
sequences of both the light chain and the heavy chain, including the CDRs, arise from human
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein.
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal
antibodies may be utilized in the practice of the present invention and may be produced by using
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In:
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).

In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 222:381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the
endogenous immunoglobulin genes have been partially or completely inactivated. Upon
challenge, human antibody production is observed, which closely resembles that seen in humans
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126;
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al.
(Nature 368 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and
Lonberg and Huszar (Intern. Rev. Immunol. 13 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals
which are modified so as to produce fully human antibodies rather than the animal's endogenous
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins
are inserted into the host's genome. The human genes are incorporated, for example, using yeast
artificial chromosomes containing the requisite human DNA segments. An animal which
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate
transgenic animals containing fewer than the full complement of the modifications. The
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells
which secrete fully human immunoglobulins. The antibodies can be obtained directly from the
animal after immunization with an immunogen of interest, as, for example, a preparation of a
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the
immunoglobulins with human variable regions can be recovered and expressed to obtain the
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for
example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No.
5,939,598. It can be obtained by a method including deleting the J segment genes from at least
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus,
the deletion being effected by a targeting vector containing a gene encoding a selectable marker;
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells
contain the gene encoding the selectable marker.

A method for producing an antibody of interest, such as a human antibody, is disclosed in
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing
an expression vector containing a nucleotide sequence encoding a light chain into another
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an
antibody containing the heavy chain and the light chain.

In a further improvement on this procedure, a method for identifying a clinically relevant
epitope on an immunogen, and a correlative method for selecting an antibody that binds
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication
WO 99/53049.

5.13.4 Fab Fragments and Single Chain Antibodies

According to the invention, techniques can be adapted for the production of single-chain
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778).
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g.,
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments,
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the
treatment of the antibody molecule with papain and a reducing agent; and (iv) Fv fragments.

5.13.5 Bispecific Antibodies

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that
have binding specificities for at least two different antigens. In the present case, one of the
binding specificities is for an antigenic protein of the invention. The second binding target is any
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit.

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a
potential mixture of ten different antibody molecules, of which only one has the correct
bispecific structure. The purification of the correct molecule is usually accomplished by affinity
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659.
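
The figure of ten different antibody molecules follows from simple counting, which the Python sketch below reproduces. It assumes that each antibody consists of two heavy-chain/light-chain half-molecules and that the order of the two halves does not matter; the chain labels H1, H2, L1 and L2 and the function names are hypothetical.

```python
from itertools import combinations_with_replacement, product

HEAVY = ("H1", "H2")  # heavy chains of the two parental antibodies
LIGHT = ("L1", "L2")  # light chains of the two parental antibodies


def quadroma_products():
    """All distinct antibodies formed from two unordered half-molecules."""
    half_molecules = list(product(HEAVY, LIGHT))  # 4 heavy/light pairings
    return list(combinations_with_replacement(half_molecules, 2))


def is_correct_bispecific(antibody):
    """True only when each heavy chain carries its cognate light chain."""
    return set(antibody) == {("H1", "L1"), ("H2", "L2")}


products = quadroma_products()
correct = [ab for ab in products if is_correct_bispecific(ab)]
print(len(products), len(correct))  # 10 1
```

Four possible half-molecules give 4 + 6 = 10 unordered pairs, of which a single pair combines both correctly matched arms, which is why purification of the desired molecule is needed.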

Antibody variable domains with the desired binding specificities (antibody-antigen
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions.
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable
host organism. For further details of generating bispecific antibodies see, for example, Suresh et
al., Methods in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between a pair
of antibody molecules can be engineered to maximize the percentage of heterodimers which are
recovered from recombinant cell culture. The preferred interface comprises at least a part of the
CH3 region of an antibody constant domain. In this method, one or more small amino acid side
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g.
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side
chain(s) are created on the interface of the second antibody molecule by replacing large amino
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for
increasing the yield of the heterodimer over other unwanted end-products such as homodimers.

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g.
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody
fragments have been described in the literature. For example, bispecific antibodies can be
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific
antibody. The bispecific antibodies produced can be used as agents for the selective
immobilization of enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity
of human cytotoxic lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments directly from
recombinant cell culture have also been described. For example, bispecific antibodies have been
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region
to form monomers and then re-oxidized to form the antibody heterodimers. This method can
also be utilized for the production of antibody homodimers. The "diabody" technology
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an
alternative mechanism for making bispecific antibody fragments. The fragments comprise a
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker
which is too short to allow pairing between the two domains on the same chain. Accordingly,
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been
reported. See, Gruber et al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example, trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide
chelator, such as EOTUBE, DPTA, DOTA, or TETA. Another bispecific antibody of interest
binds the protein antigen described herein and further binds tissue factor (TF).

5.13.6 Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089).
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic
protein chemistry, including those involving crosslinking agents. For example, immunotoxins
can be constructed using a disulfide exchange reaction or by forming a thioether bond.
Examples of suitable reagents for this purpose include iminothiolate and methyl-4-mercaptobutyrimidate
and those disclosed, for example, in U.S. Patent No. 4,676,980.

5.13.7 Effector Function Engineering

It can be desirable to modify the antibody of the invention with respect to effector function, so as
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond
formation in this region. The homodimeric antibody thus generated can have improved
internalization capability and/or increased complement-mediated cell killing and antibody-dependent
cellular cytotoxicity (ADCC). See Caron et al., J. Exp Med., 176: 1191-1195 (1992)
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced anti-tumor
activity can also be prepared using heterobifunctional cross-linkers as described in Wolff
et al. Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities.
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).

5.13.8 Immunoconjugates

The invention also pertains to immunoconjugates comprising an antibody conjugated to a
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a
radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have been
described above. Enzymatically active toxins and fragments thereof that can be used include
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin,
Aleurites fordii proteins, dianthin proteins, Phytolaca americana proteins (PAPI, PAPII, and
PAP-S), momordica charantia inhibitor, curcin, crotin, saponaria officinalis inhibitor, gelonin,

mitogellin, restrictocin, phenomycin, enomycin, and the tricothecenes. A variety of
radionuclides are available for the production of radioconjugated antibodies. Examples include
212Bi, 131I, 131In, 90Y, and 186Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP),
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCL),
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate),
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid
(MX-DTPA) is an exemplary chelating agent for conjugation of radionucleotide to the antibody. See
WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.

4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention can
be recorded on computer readable media. As used herein, "computer readable media" refers to
any medium which can be read and accessed directly by a computer. Such media include, but
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable mediums can
be used to create a manufacture comprising computer readable medium having recorded thereon
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for
storing information on computer readable medium. A skilled artisan can readily adopt any of the
presently known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a
computer readable medium having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally be based on the means chosen
to access the stored information. In addition, a variety of data processor programs and formats
can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file,
formatted in commercially-available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase,
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring
formats (e.g. text file or database) in order to obtain computer readable medium having recorded
thereon the nucleotide sequence information of the present invention.

By providing any of the nucleotide sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942
or 3949-3954 or a representative fragment thereof; or a nucleotide sequence at least 95%
identical to any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 in computer readable form, a skilled artisan can routinely access the sequence
information for a variety of purposes. Computer software is publicly available which allows a
skilled artisan to access sequence information provided in a computer readable medium. The
examples which follow demonstrate how software which implements the BLAST (Altschul et
al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207
(1993)) search algorithms on a Sybase system is used to identify open reading frames (ORFs)
within a nucleic acid sequence. Such ORFs may be protein encoding fragments and may be
useful in producing commercially important proteins such as enzymes used in fermentation
reactions and in the production of commercially useful metabolites.
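
By way of illustration only, the identification of ORFs within a nucleic acid sequence can be
sketched as follows. This is a minimal sketch and not the BLAST or BLAZE software referred to
above. It assumes that an ORF is a run of codons beginning with ATG and ending at the first
in-frame stop codon, and it scans only the three forward reading frames. The names find_orfs and
min_codons are hypothetical and do not appear in the specification.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end) pairs, 0-based with end exclusive, for each
    ATG...stop open reading frame in the three forward reading frames."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the ATG that opened the current ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Count codons including the start and the stop codon.
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)
```

For example, find_orfs("CCATGAAATAGGG") reports a single ORF spanning positions 2 to 11, i.e.
ATG AAA TAG. A practical system would also scan the reverse complement and apply a far larger
minimum length.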

As used herein, "a computer-based system" refers to the hardware means, software
means, and data storage means used to analyze the nucleotide sequence information of the
present invention. The minimum hardware means of the computer-based systems of the present
invention comprises a central processing unit (CPU), input means, output means, and data
storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention. As stated above, the
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded thereon
the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented
on the computer-based system to compare a target sequence or target structural motif with the
sequence information stored within the data storage means. Search means are used to identify
fragments or regions of a known sequence which match a particular target sequence or target
motif. A variety of known algorithms are disclosed publicly, and a variety of commercially
available software for conducting search means is available and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not limited to,
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTA (NCBI). A
skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will be
present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide
residues. However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may be of
shorter length.
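
The relationship between target length and random occurrence can be made concrete with a short
sketch. It is illustrative only: it assumes independent, equally probable residues, which real
databases do not satisfy, and it uses exact matching rather than any of the homology search
packages named above. The function names expected_random_hits and find_exact are hypothetical.

```python
def expected_random_hits(target_len, db_len, alphabet_size=4):
    """Expected number of chance exact matches of a target of the given
    length in a database of db_len residues, assuming independent and
    equiprobable residues (alphabet_size is 4 for nucleotides)."""
    positions = max(db_len - target_len + 1, 0)
    return positions * alphabet_size ** (-target_len)


def find_exact(target, stored):
    """Return {name: [0-based positions]} for every stored sequence in
    which the target occurs at least once (overlapping hits included)."""
    hits = {}
    for name, seq in stored.items():
        positions = [
            i for i in range(len(seq) - len(target) + 1)
            if seq[i:i + len(target)] == target
        ]
        if positions:
            hits[name] = positions
    return hits
```

Under these assumptions a six-nucleotide target is expected to occur about once by chance in
roughly 4,100 residues, whereas a twelve-nucleotide target is expected about 0.0002 times in the
same database, which is why longer targets are less likely to be present as random occurrences.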

As used herein, "a target structural motif," or "target motif," refers to any rationally
selected sequence or combination of sequences in which the sequence(s) are chosen based on a
three-dimensional configuration which is formed upon the folding of the target motif. There are
a variety of target motifs known in the art. Protein target motifs include, but are not limited to,
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited
to, promoter sequences, hairpin structures and inducible expression elements (protein binding
sequences).



4.15 TRIPLE HELIX FORMATION

In addition, the fragments of the present invention, as broadly described, can be used to
control gene expression through triple helix formation or antisense DNA or RNA, both of which
methods are based on the binding of a polynucleotide sequence to DNA or RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide.
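
As a minimal sketch of how sequence information feeds the design of an antisense
oligonucleotide, the following derives the DNA reverse complement of a chosen region of an mRNA
and enforces the 20 to 40 base length stated above. It covers the antisense case only; triple
helix design follows different pairing rules and is not modeled. The function name
antisense_dna and its parameters are hypothetical.

```python
# Watson-Crick complements, written as DNA; U is accepted for mRNA input.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}


def antisense_dna(mrna, start, length=20):
    """Return the DNA oligonucleotide (5'->3') complementary to
    mrna[start:start + length], where start is a 0-based position."""
    if not 20 <= length <= 40:
        raise ValueError("length must be between 20 and 40 bases")
    region = mrna.upper()[start:start + length]
    if len(region) < length:
        raise ValueError("region runs past the end of the mRNA")
    # Reverse the region, then complement each base.
    return "".join(_COMPLEMENT[base] for base in reversed(region))
```

Candidate oligonucleotides produced this way would still need to be checked for specificity,
for example with a search means of the kind described in Section 4.14.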

4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated
with a suitable label.

In general, methods for detecting a polynucleotide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the polynucleotide
for a period sufficient to form the complex, and detecting the complex, so that if a complex is
detected, a polynucleotide of the invention is detected in the sample. Such methods can also
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is
detected in the sample.

In general, methods for detecting a polypeptide of the invention can comprise contacting 
a sample with a compound that binds to and forms a complex with the polypeptide for a period 
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a 
polypeptide of the invention is detected in the sample. 

In detail, such methods comprise incubating a test sample with one or more of the
antibodies or one or more of the nucleic acid probes of the present invention and assaying for
binding of the nucleic acid probes or antibodies to components within the test sample.

Conditions for incubating a nucleic acid probe or antibody with a test sample vary.
Incubation conditions depend on the format employed in the assay, the detection methods
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One
skilled in the art will recognize that any one of the commonly available hybridization,
amplification or immunological assay formats can readily be adapted to employ the nucleic acid
probes or antibodies of the present invention. Examples of such assays can be found in Chard,
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers,
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry,
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the
present invention include cells, protein or membrane extracts of cells, or biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method
will vary based on the assay format, nature of the detection method and the tissues, cells or
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane
extracts of cells are well known in the art and can readily be adapted in order to obtain a
sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the
necessary reagents to carry out the assays of the present invention. Specifically, the invention
provides a compartment kit to receive, in close confinement, one or more containers which
comprises: (a) a first container comprising one of the probes or antibodies of the present
invention; and (b) one or more other containers comprising one or more of the following: wash
reagents, reagents capable of detecting presence of a bound probe or antibody.

In detail, a compartment kit includes any kit in which reagents are contained in separate
containers. Such containers include small glass containers, plastic containers or strips of plastic
or paper. Such containers allow one to efficiently transfer reagents from one compartment to
another compartment such that the samples and reagents are not cross-contaminated, and the
agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test
sample, a container which contains the antibodies used in the assay, containers which contain
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which
contain the reagents used to detect the bound antibody or probe. Types of detection reagents
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed
probes and antibodies of the present invention can be readily incorporated into one of the
established kit formats which are well known in the art.

4.17 MEDICAL IMAGING 

The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or infection).
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target
site.



4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present invention
further provides methods of obtaining and identifying agents which bind to a polypeptide
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID NO:
1-984, 1969-2952, 3937-3942 or 3949-3954, or bind to a specific domain of the polypeptide
encoded by the nucleic acid. In detail, said method comprises the steps of:
(a) contacting an agent with an isolated protein encoded by an ORF of the present
invention, or nucleic acid of the invention; and
(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds
to a polynucleotide of the invention is identified.

Likewise, in general, such methods for identifying compounds that bind to a
polypeptide of the invention can comprise contacting a compound with a polypeptide of the
invention for a time sufficient to form a polypeptide/compound complex, and detecting the
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a
polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can also
comprise contacting a compound with a polypeptide of the invention in a cell for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound that
binds a polypeptide of the invention is identified.

Compounds identified via such methods can include compounds which modulate the
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to
activity observed in the absence of the compound). Alternatively, compounds identified via such
methods can include compounds which modulate the expression of a polynucleotide of the
invention (that is, increase or decrease expression relative to expression levels observed in the
absence of the compound). Compounds, such as compounds identified via the methods of the
invention, can be tested using standard assays well known to those of skill in the art for their
ability to modulate activity/expression.

The agents screened in the above assay can be, but are not limited to, peptides,
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected
and screened at random or rationally selected or designed using protein modeling techniques.

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and
the like are selected at random and are assayed for their ability to bind to the protein encoded by
the ORF of the present invention. Alternatively, agents may be rationally selected or designed.
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen
based on the configuration of the particular protein. For example, one skilled in the art can
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like, capable of binding to a specific peptide sequence, in order to generate rationally designed
antipeptide peptides (for example, see Hurby et al., "Application of Synthetic Peptides: Antisense
Peptides," In Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kaspczak et al., Biochemistry 28:9230-8 (1989)), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly
described, can be used to control gene expression through binding to one of the ORFs or EMFs
of the present invention. As described above, such agents can be randomly screened or
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design
sequence specific or element specific agents, modulating the expression of either a single ORF or
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding
agents are agents which contain base residues which hybridize or form a triple helix formation
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester,
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have
base attachment capacity.

Agents suitable for use in these methods preferably contain 20 to 40 bases and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide and other DNA binding agents.

Agents which bind to a protein encoded by one of the ORFs of the present invention can
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the
present invention can be formulated using known techniques to generate a pharmaceutical
composition.

4.19 USE OF NUCLEIC ACIDS AS PROBES

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The
hybridization probes of the subject invention may be derived from any of the nucleotide
sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. Because the
corresponding gene is only expressed in a limited number of tissues, a hybridization probe
derived from any of the nucleotide sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 can be used as an indicator of the presence of RNA of cell type of such a tissue in a
sample.

Any suitable hybridization technique can be employed, such as, for example, in situ
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a
degenerate pool of possible sequences for identification of closely related genomic sequences.
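
What a degenerate pool of possible sequences contains can be shown with a short sketch. It
assumes the probe is written with the standard IUPAC nucleotide ambiguity codes, a notation the
specification itself does not use, and the function name expand_degenerate is hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (an assumed notation).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """Return every discrete sequence represented by a degenerate probe,
    in the order produced by taking each position's options in turn."""
    options = (IUPAC[base] for base in probe.upper())
    return ["".join(combo) for combo in product(*options)]
```

For instance, expand_degenerate("ARY") yields the four sequences AAC, AAT, AGC and AGT. The pool
size is the product of the number of options at each position, so it grows quickly with the
number of degenerate positions.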

Other means for producing specific hybridization probes for nucleic acids include the
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors
are known in the art and are commercially available and may be used to synthesize RNA probes
in vitro by means of the addition of the appropriate RNA polymerase such as T7 or SP6 RNA
polymerase and the appropriate radioactively labeled nucleotides. The nucleotide sequences may
be used to construct hybridization probes for mapping their respective genomic sequences. The
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a
chromosome using well known genetic and/or chromosomal mapping techniques. These
techniques include in situ hybridization, linkage analysis against known chromosomal markers,
hybridization screening with libraries or flow-sorted chromosomal preparations specific to
known chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al. (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical
chromosome mapping techniques may be correlated with additional genetic map data. Examples
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or
predisposition to a specific disease) may help delimit the region of DNA associated with that
genetic disease. The nucleotide sequences of the subject invention may be used to detect
differences in gene sequences between normal, carrier or affected individuals.

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced
using an automated oligonucleotide synthesizer.

Support bound oligonucleotides may be prepared by any of the methods known to those of
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72);
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all
references being specifically incorporated herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6,
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo.
Of course, this same linking chemistry is applicable to coating any surface with streptavidin.
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies
(Alameda, CA).

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be used. Nunc
Laboratories have developed a method by which DNA can be covalently bound to the microwell
surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has
been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and
then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/ul) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole,
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. A ss DNA solution is
then dispensed into CovaLink NH strips (75 ul/well) standing on ice.

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in
10 mM 1-MeIm7, is made fresh and 25 ul added per well. The strips are incubated for 5 hours at
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
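
The quantities in this protocol can be checked with simple dilution arithmetic. The sketch below
is illustrative and rests on an assumption the text does not settle: that 7.5 ng/ul is the DNA
concentration of the solution actually dispensed into the wells, with the small dilution caused
by adding 1-MeIm7 ignored. The function names dilution_volume and well_summary are hypothetical.

```python
def dilution_volume(stock_conc, final_conc, final_volume):
    """Volume of stock needed so that C1 * V1 = C2 * V2.
    Concentrations share one unit; the result is in the unit of final_volume."""
    return final_conc * final_volume / stock_conc


def well_summary(dna_ng_per_ul=7.5, dna_ul=75.0, edc_molar=0.2, edc_ul=25.0):
    """Per-well totals after the EDC solution is added to the DNA solution."""
    total_ul = dna_ul + edc_ul
    return {
        "dna_ng": dna_ng_per_ul * dna_ul,                # DNA mass per well
        "total_ul": total_ul,                            # final reaction volume
        "edc_final_molar": edc_molar * edc_ul / total_ul,
    }
```

Under these assumptions each well receives about 562.5 ng of DNA in a final volume of 100 ul, the
EDC is diluted to about 0.05 M, and bringing 0.1 M 1-MeIm7 to 10 mM takes 100 ul of stock per
1000 ul of final solution.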

It is contemplated that a further suitable method for use with the present invention is that
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by
reference. This method of preparing an oligonucleotide bound to a support involves attaching a
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include
nucleoside phosphoramidite and nucleoside hydrogen phosphonate.

An on-chip strategy for the preparation of DNA probe arrays may be employed. For example,
addressable laser-activated photodeprotection may be employed in the chemical synthesis of
oligonucleotides directly on a glass surface, as described by Fodor et al. (1991) Science
251(4995) 767-73, incorporated herein by reference. Probes may also
be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res.
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem.
169(1) 104-8; all references being specifically incorporated herein.

To link an oligonucleotide to a nylon support, as described by Van Ness et al. (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al., (1994) PNAS USA 91(11) 5022-6, incorporated
herein by reference. These authors used current photolithographic techniques to generate arrays of
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be
generated in this manner.
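
The figure of 256 equals 4^4, the number of distinct sequences over four variable nucleotide
positions. On the assumption that this is how the matrix arises (the text does not say so), a
combinatorial layout can be sketched as below; it enumerates the sequences and assigns each one
a row and column in a square grid. The function name probe_matrix is hypothetical, and the
sketch does not model the photolithographic synthesis itself.

```python
from itertools import product


def probe_matrix(k=4, alphabet="ACGT"):
    """Enumerate all len(alphabet)**k sequences of length k and arrange
    them row by row in a square grid (a list of equal-length rows)."""
    probes = ["".join(combo) for combo in product(alphabet, repeat=k)]
    side = int(round(len(probes) ** 0.5))
    if side * side != len(probes):
        raise ValueError("probe count is not a perfect square")
    return [probes[row * side:(row + 1) * side] for row in range(side)]
```

With the default arguments the result is a 16 by 16 grid whose first cell holds AAAA and whose
last cell holds TTTT, so every probe has a defined spatial address.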
4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA,
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describes
three protocols for the isolation of high molecular weight DNA from mammalian cells (p.
9.14-9.23).

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be
prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those of skill
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook et
al. (1989), shearing by ultrasound and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic
Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA samples are
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever
device allows controlled application of low to intermediate pressures to the cell. The results of these
studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA
fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using the two
base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids Res.
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation
of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and
sequencing.

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small
molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lacZ minus
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate
consistent with random fragmentation.
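
The two specificities described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the sequence is treated as linear, and a complete digest is assumed, whereas the reported CviJI** digests are partial and quasi-random.

```python
import re

# Shorthand used in the text: Pu = purine (A or G), Py = pyrimidine (C or T).
PU = "[AG]"
PY = "[CT]"

# Normal CviJI specificity: PuGCPy. Relaxed CviJI** specificity as reported
# above: PuGCPy plus PyGCPy and PuGCPu (PyGCPu is not listed, so it is excluded).
NORMAL_SITES = [f"{PU}GC{PY}"]
RELAXED_SITES = [f"{PU}GC{PY}", f"{PY}GC{PY}", f"{PU}GC{PU}"]


def cut_positions(seq, site_patterns):
    """Return sorted 0-based indices at which the strand is cut.

    A cut index i splits the strand into seq[:i] and seq[i:], that is,
    between the G and the C of each matching four-base site (blunt cut).
    """
    seq = seq.upper()
    cuts = set()
    for pattern in site_patterns:
        # A lookahead is used so that overlapping sites are all reported.
        for match in re.finditer(f"(?=({pattern}))", seq):
            cuts.add(match.start() + 2)  # site is N-G|C-N, cut after the G
    return sorted(cuts)


def fragments(seq, site_patterns):
    """Split a linear sequence at every cut position (complete digest assumed)."""
    bounds = [0] + cut_positions(seq, site_patterns) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]
```

For example, `fragments("TTAGCTTTTGCTTTAGCATT", RELAXED_SITES)` cuts at the AGCT, TGCT and AGCA sites, while `NORMAL_SITES` cuts only at AGCT.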

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 ug instead of 2-5
ug); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel
electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is
important to denature the DNA to give single stranded pieces available for hybridization. This is
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art.

4.22 PREPARATION OF DNA ARRAYS

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane.
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a
nylon membrane. By offset printing, a density of dots higher than the density of the wells is
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays)
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same
gene) from different individuals, or may be different, overlapped genomic clones. Each of the
subarrays may represent replica spotting of the same samples. In one example, a selected gene
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane.
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the
dot span may be 1 mm² and there may be a 1 mm space between subarrays.
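
The geometry of the 64-patient example can be checked with a small Python sketch. Two values are assumptions not stated in the text: a 9 mm pin pitch (the usual 96-well spacing) and an 8 x 8 arrangement of the 64 dots within each subarray. The function names are hypothetical.

```python
PIN_ROWS, PIN_COLS = 8, 12   # 96-pin device matching a 96-well plate
PIN_PITCH_MM = 9.0           # assumption: standard 96-well spacing
DOT_SPAN_MM = 1.0            # from the text: 1 mm per dot
SUB_SIDE = 8                 # assumption: 64 samples laid out as an 8 x 8 block


def dot_position(patient, pin_row, pin_col):
    """Return (x_mm, y_mm) of the dot for one patient under one pin.

    patient: 0..63, index of the patient plate being printed.
    pin_row, pin_col: position of the pin in the 8 x 12 device.
    """
    if not 0 <= patient < SUB_SIDE * SUB_SIDE:
        raise ValueError("patient index out of range")
    if not (0 <= pin_row < PIN_ROWS and 0 <= pin_col < PIN_COLS):
        raise ValueError("pin position out of range")
    # Each plate is printed with the whole pin head shifted by a small offset.
    off_row, off_col = divmod(patient, SUB_SIDE)
    x = pin_col * PIN_PITCH_MM + off_col * DOT_SPAN_MM
    y = pin_row * PIN_PITCH_MM + off_row * DOT_SPAN_MM
    return (x, y)


def subarray_gap_mm():
    """Free space left between neighbouring subarrays along one axis."""
    return PIN_PITCH_MM - SUB_SIDE * DOT_SPAN_MM
```

Under these assumptions the 96 subarrays occupy about 107 mm x 71 mm, which fits the 8 x 12 cm membrane, and the gap between subarrays comes out at the 1 mm mentioned in the text.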

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois)
which may be partitioned by physical spacers, e.g., a plastic grid molded over the membrane, the grid
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage
screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration of the
present disclosure, one of skill in the art will appreciate that many other embodiments and variations
may be made in the scope of the present invention. Accordingly, it is intended that the broader
aspects of the present invention not be limited to the disclosure of the following examples. The
present invention is not to be limited in scope by the exemplified embodiments which are intended
as illustrations of single aspects of the invention, and compositions and methods which are
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and
variations in the practice of the invention are expected to occur to those skilled in the art upon
consideration of the present preferred embodiments. Consequently, the only limitations which
should be placed upon the scope of the invention are those which appear in the appended claims.

All references cited within the body of the instant specification are hereby incorporated by
reference in their entirety.

5.0 EXAMPLES 

5.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries
A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various
human tissues and in some cases isolated from a genomic library derived from human chromosome
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The
inserts of the library were amplified with PCR using primers specific for the vector sequences which
flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened
with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered
into groups of similar or identical sequences. Representative clones were selected for sequencing.
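
The idea of grouping clones by probe signature can be sketched in Python as follows. This is a simplified illustration, not the SBH procedure actually used: hybridization is approximated by exact substring matching, similarity is measured with a Jaccard index, and the threshold and function names are assumptions.

```python
def signature(clone_seq, probes):
    """Set of probes that occur as exact substrings of the clone sequence."""
    seq = clone_seq.upper()
    return frozenset(p for p in probes if p in seq)


def jaccard(a, b):
    """Jaccard similarity of two probe sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_clones(clones, probes, min_similarity=0.8):
    """Greedy single-pass clustering of clones by signature similarity.

    clones: dict mapping clone name -> sequence.
    Returns a list of clusters; the first member of each cluster can serve
    as the representative clone selected for sequencing.
    """
    clusters = []  # list of (representative_signature, [clone names])
    for name, seq in clones.items():
        sig = signature(seq, probes)
        for rep_sig, members in clusters:
            if jaccard(sig, rep_sig) >= min_similarity:
                members.append(name)
                break
        else:
            # No existing cluster is similar enough, so start a new one.
            clusters.append((sig, [name]))
    return [members for _, members in clusters]
```

A real screen would use many more probes and tolerate hybridization noise; the sketch only shows how identical or near-identical signatures end up in the same group.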

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction.

5.2 EXAMPLE 2 

Assemblage of Novel Nucleic Acids

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 1969-2951,
and 3949-3954 were assembled using an EST sequence as a seed. Then a recursive algorithm was
used to extend the seed EST into an extended assemblage, by pulling additional sequences from
different databases (i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri
114, and UniGene version 101) that belong to this assemblage. The algorithm terminated when
there were no additional sequences from the above databases that would extend the assemblage.
Inclusion of component sequences into the assemblage was based on a BLASTN hit to the
extending assemblage with BLAST score greater than 300 and percent identity greater than 95%.
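
The extension procedure described above can be outlined in Python. This is a minimal sketch, not the algorithm actually used: it is written as a loop rather than a recursion, the `search` callback standing in for a BLASTN run is hypothetical, and only the two inclusion thresholds are taken from the text.

```python
from collections import namedtuple

# One BLASTN hit of a database sequence against the current assemblage.
Hit = namedtuple("Hit", ["seq_id", "score", "percent_identity"])

MIN_SCORE = 300        # from the text: BLAST score greater than 300
MIN_IDENTITY = 95.0    # from the text: percent identity greater than 95%


def passes(hit):
    """Apply the inclusion criteria stated in the text."""
    return hit.score > MIN_SCORE and hit.percent_identity > MIN_IDENTITY


def extend_assemblage(seed_id, search):
    """Grow an assemblage from a seed until no new member qualifies.

    search(member_ids) is a caller-supplied (hypothetical) function that
    returns an iterable of Hit objects for the current set of members.
    """
    members = {seed_id}
    while True:
        hits = search(frozenset(members))
        new_ids = {h.seq_id for h in hits if passes(h)} - members
        if not new_ids:      # nothing left that extends the assemblage
            return members
        members |= new_ids
```

The loop stops under the same condition as the text: when the databases return no further sequence that passes both thresholds.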

Tables 6 and 8 set forth the novel predicted polypeptides (including proteins) encoded by
the novel polynucleotides (SEQ ID NO: 2953-3936, and 3949-3954) of the present invention, and
their corresponding nucleotide locations to each of SEQ ID NO: 2953-3936 and 3955-3960. Tables
6 and 8 also indicate the method by which the polypeptide was predicted. Method A refers to a
polypeptide obtained by using a software program called FASTY (available from
http://fasta.bioch.virginia.edu) which selects a polypeptide based on a comparison of the translated
novel polynucleotide to known polynucleotides (W.R. Pearson, Methods in Enzymology, 183:63-98
(1990), herein incorporated by reference). Method B refers to a polypeptide obtained by using a
software program called GenScan for human/vertebrate sequences (available from Stanford
University, Office of Technology Licensing) that predicts the polypeptide based on a probabilistic
model of gene structure/compositional properties (C. Burge and S. Karlin, J. Mol. Biol., 268:78-94
(1997), incorporated herein by reference). Method C refers to a polypeptide obtained by using a
Hyseq proprietary software program that translates the novel polynucleotide and its complementary
strand into six possible amino acid sequences (forward and reverse frames) and chooses the
polypeptide with the longest open reading frame.
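
Method C is proprietary and its details are not given, but the general idea of six-frame translation followed by selection of the longest open reading frame can be sketched in Python. The definition of an open reading frame used here (a methionine-initiated stretch up to the next stop codon or the end of the frame) and all function names are assumptions made for illustration.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code; "*" marks a stop codon.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate in frame 0; codons with unknown bases become 'X'."""
    seq = seq.upper()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frames(seq):
    """Three forward and three reverse-strand translations."""
    strands = (seq.upper(), reverse_complement(seq))
    return [translate(s[offset:]) for s in strands for offset in range(3)]


def longest_orf(seq):
    """Longest Met-initiated stretch before a stop (or frame end) in any frame."""
    best = ""
    for protein in six_frames(seq):
        for segment in protein.split("*"):
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best):
                best = segment[start:]
    return best
```

Calling `longest_orf` on a sequence whose coding region lies on the complementary strand returns the same polypeptide as for the forward strand, which is the point of examining all six frames.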

5.3 EXAMPLE 3
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), full length gene cDNA sequences
and their corresponding protein sequences were generated from the assemblage. Any frame shifts
and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank. Other computer programs which may
have been used in the editing process were phredPhrap and Consed (University of Washington) and
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences are shown in the
Sequence Listing as SEQ ID NO: 1-351. The amino acids are SEQ ID NO: 985-1335.
Table 1 shows the various tissue sources of SEQ ID NO: 1-351.

The nearest neighbor results for SEQ ID NO: 1-351 were obtained by a BLASTP version
2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000 release
21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 1-351 from Genpept. The translated amino acid sequences
encoded by the nucleic acid sequences are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 1-351 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1),
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process for
identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also disclosed by
Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the publication
"Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites,"
Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum
S score and a mean S score, as described in the Nielsen et al. reference, were obtained for the
polypeptide sequences. Table 7 shows the position of the signal peptide in each of the polypeptides
and the maximum score and mean score associated with that signal peptide.
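
As a rough illustration of the two summary values reported in Table 7, the Python sketch below computes a maximum and a mean S score from per-residue scores. It assumes that the predictor has already produced one S score per residue and a predicted cleavage position, that the maximum is taken over all supplied scores, and that the mean is taken over the predicted signal peptide; the function name and inputs are hypothetical and this is not the SignalP program itself.

```python
def s_score_summary(s_scores, cleavage_after):
    """Summarise per-residue S scores for one polypeptide.

    s_scores: list of floats, one per residue, N-terminus first
              (hypothetical values; real ones come from the predictor).
    cleavage_after: number of residues in the predicted signal peptide,
              i.e. the cleavage site lies after residue `cleavage_after`.
    Returns (max_s, mean_s), with mean_s averaged over the signal peptide.
    """
    if not 0 < cleavage_after <= len(s_scores):
        raise ValueError("cleavage site outside the sequence")
    signal_part = s_scores[:cleavage_after]
    return max(s_scores), sum(signal_part) / len(signal_part)
```

For instance, scores of 1.0, 0.5, 0.0 and 0.25 with a cleavage site after the third residue give a maximum of 1.0 and a mean of 0.5.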
5.4 EXAMPLE 4
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 352-766. The
corresponding amino acids are SEQ ID NO: 1336-1750.

Table 1 shows the various tissue sources of SEQ ID NO: 352-766. 
The nearest neighbor results for SEQ ID NO: 352-766 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release 21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 352-766 from Genpept. The translated amino acid sequences
encoded by the nucleic acid sequences are shown in the Sequence Listing. The homologs with
identifiable functions for SEQ ID NO: 352-766 are shown in Table 2 below.
Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1),
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.5 EXAMPLE 5
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 767-930. The
corresponding amino acid sequences are SEQ ID NO: 1751-1914.

Table 1 shows the various tissue sources of SEQ ID NO: 767-930. 
The homology results for SEQ ID NO: 767-930 were obtained by a BLASTP version
2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000 release
21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed the homologs for
SEQ ID NO: 767-930 from Genpept. The translated amino acid sequences encoded by the nucleic
acid sequences are shown in the Sequence Listing. The homologues with identifiable
functions for SEQ ID NO: 767-930 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1),
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.6 EXAMPLE 6 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 931-965. The
corresponding amino acid sequences are shown in SEQ ID NO: 1915-1949.
Table 1 shows the various tissue sources of SEQ ID NO: 931-965.

The nearest neighbor results for SEQ ID NO: 931-965 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 931-965 from Genpept. The translated amino acid sequences
encoded by the nucleic acid sequences are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 931-965 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1),
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.7 EXAMPLE 7 
Novel Nucleic Acids
Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 119, gb pri 119,
UniGene version 119, Genpept release 119). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 966-974. The
corresponding amino acid sequences are SEQ ID NO: 1950-1958.
Table 1 shows the various tissue sources of SEQ ID NO: 966-974.

The nearest neighbor results for SEQ ID NO: 966-974 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 966-974 from Genpept. The translated amino acid sequences
encoded by the nucleic acid sequences are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 966-974 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1),
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.



5.8 EXAMPLE 8
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 120, gb pri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 975-984. The
corresponding amino acid sequences are SEQ ID NO: 1959-1968.

Table 1 shows the various tissue sources of SEQ ID NO: 975-984.
The nearest neighbor results for SEQ ID NO: 975-984 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 21, 2000
release (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 975-984 from Genpept. The translated amino acid sequences
encoded by the nucleic acid sequences are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 975-984 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1),
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.9 EXAMPLE 9
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 120, gb pri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 3937-3942. The
corresponding peptide sequences are SEQ ID NO: 3943-3948.

Table 1 shows the various tissue sources of SEQ ID NO: 3937-3942.
The nearest neighbor results for SEQ ID NO: 3937-3942 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release 21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 3937-3942 from Genpept. The translated amino acid sequences
encoded by the nucleic acid sequences are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 3937-3942 are shown in Table 9 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 10 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1),
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 11 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 12 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

Tables 5 and 13 are correlation tables of all of the sequences and the SEQ ID NOS. 
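
The SEQ ID ranges quoted in Examples 3 to 9 pair each block of full-length nucleotide sequences with a block of amino acid sequences of the same size. The Python sketch below encodes those ranges and looks up the paired identifier. It assumes that sequences are paired in order within each block, which the text does not state explicitly; the correlation tables remain the authoritative source, and the function name is hypothetical.

```python
# (first nucleotide SEQ ID, last nucleotide SEQ ID, first amino acid SEQ ID),
# transcribed from the ranges given in Examples 3 to 9.
SEQ_ID_BLOCKS = [
    (1, 351, 985),
    (352, 766, 1336),
    (767, 930, 1751),
    (931, 965, 1915),
    (966, 974, 1950),
    (975, 984, 1959),
    (3937, 3942, 3943),
]


def peptide_seq_id(nucleotide_seq_id):
    """Return the amino acid SEQ ID paired with a full-length nucleotide SEQ ID.

    Assumes in-order pairing within each block (an assumption for illustration).
    """
    for first_nt, last_nt, first_aa in SEQ_ID_BLOCKS:
        if first_nt <= nucleotide_seq_id <= last_nt:
            return first_aa + (nucleotide_seq_id - first_nt)
    raise KeyError(f"SEQ ID NO: {nucleotide_seq_id} is not a full-length nucleotide entry")
```

Each block end maps onto the stated end of the amino acid range (for example 351 to 1335 and 3942 to 3948), which is a quick consistency check on the ranges quoted in the examples.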



TABLE 1 



Tissue Origin | RNA | Library Name | SEQ ID NOS:


lung | | | 3 11 25 49 65 75 114 141 156 160 172 190 198 209 217 224 229 234-235 267 269 274 277 282 284 303 308 312 320 334 336 352 372 396 398 412 414 437 453 464 470 481 492-494 508-509 532 539 581 584 617-619 621 628 633 643 688 691 745 752 761 768 794 822 837 848 876 887 953 967 973


adult brain 


GIBCO 


AB3001 


1 3 12-13 16 22-24 28-29 41 48 58 65 78 
82 89-90 94 97 103 112 114-115 117 120 
122 130-131 168 181 184 186-187 189- 
190 198 208 216 247 249 259 270 277 
297 301 308 312 314 321 333 348 374 
396 403 406 410 412 416-417 420 423 
426-427 431 456 474 481 484-485 488 
498 500 508-509 530 549 553 558 563- 
564 583 596 602-603 608 612 621-622 
624 643 650 674 699 711 736 738-739
753 770 779-780 785-786 802-803 816 
822 839 842 848 859 861 871 893-894 
897 900 903 925 954 958 967 969 


adult brain 


GIBCO 


ABD003 


3 19 21-25 28-29 31 33-34 37 39 41 46-48 
53 58 63-64 66 72 78 80 99 103 109-110
112 114 118 120-124 126 132-133 135
139 143 146 148-149 159 163 168 174
176 179-180 184-185 188-190 202 208- 
209 216-217 221 223 230 234-235 240 
244 249 251 253 255 258-259 263 269- 
270 277 282 285-286 290 294-295 297 
301-302 304-305 307-308 311-312 314 
320 329 333 335-336 342 344 346 349 
354 358 365 370 373-374 377 380 382- 
383 388 394-396 399 401-402 406 409- 
410 413 416 420-421 425 428 430-431 
436-437 442 456 462 464 466-467 474 
484 486 495-496 500-501 506 508-509 
519 530 537 542 549 561-562 564 572 
574 577-578 580-583 586-587 589 592- 
593 596-597 601 608 610 612-614 617- 
624 630-632 635 637 650 658 663-664 
668 676 679 681 689-690 693 699 724 
726 732 736 742-743 747 767-770 780 
784 789 793 799 802-805 813 817-818 
822 824 829-831 837 839 845 848 856 
859-860 864 871-872 875-876 881 887 
896-897 901 903 907 910-911 925 930 
933 943-944 947 952-953 958 962-963 
965 967 972 977 


adult brain 


Clontech 


ABR001


3 53 66 113 115 126 135 160 172 179 185 
204 263 273 305 312 323 358 380 383 
395-396 403 420 428-429 431 461 542 
583 586 606-607 611 620 645-646 688
690 715 732:736 740 748 754 768 784- 
786 790 796 800 878 897 906-907 947 
977 


adult brain 


Clontech 


ABR006 


19 32 49 53 60 72 91 103 118 125 130- 
131 134 184 224 275 338 350 354 361- 
363 374 384 390 394 396 431-432 434- 
435 445 468 549 621 732 734-736 745 
760-761 764 768-769 775 787 806 811
818 887 903 906 918 930 942 947 957 
973 977 


adult brain 


Clontech 


ABR008 


2-3 9-11 14 17 21 23-25 28-29 31-35 37 
41-42 45 47-48 56-57 65-66 69-70 72 75 
77-78 88 91-92 97-99 101 103 112-115
118-128 130-131 135 138-140 142 144- 
146 148 152 156-157 159-160 163 168 
172 174 176 178-180 182-190 194 196- 
198 200-201 204 209-214 218 220-225 
228-230 232-233 238-240 243-244 246 
254-256 260-264 270 272-274 278-279 
282-285 289-291 293-294 296-297 301 
303-306 312-314 317 321-322 325-328 
334 336 338 340-342 344 346 348 350- 
352 354 356-358 363 366 369-374 376 
379-381 383-386 388-394 398-399 402-
403 405 409-412 414 418-421 423-424
426-427 430 433-437 443 445-450 452 
456-457 460 462 464 471 479 482-483 
485 488 490-498 505 507 510 516 519- 
522 524 527-532 535 538-539 542-545 
548 551 553 555 561-562 566 569 571 
574 580-583 588-589 593 597 601-608 
611-612 614-615 617-618 621-622 624 
630-635 642 644 646-648 650-652 655 
657 659-661 664-665 668 672 674 689 
693-699 701-702 708 711 715 717 724 
728-730 732 734-735 738-740 745 747- 
750 753-755 757 761 763-764 766-769 
772-773 775 780-781 789-791 793-795 
799-800 802-806 809 812 818-819 821- 
822 826 829-830 832 834-835 841 843 
845 856 858-859 861 864 866 870 872 
876 880 883 885 887 893-898 902 906- 
916 918 921 925-926 930-931 933 942- 
943 946 948 950-951 953-954 958-960 
962-965 967 969-970 972 977 


adult brain 


Clontech 


ABR011


57 196 270 304 344 436 834 


adult brain 


BioChain 


ABR012 


14 82 121-122 168 691 


adult brain 


Invitrogen 


ABR013 


72 108 263 270 336 425 492-494 732 787
790 826 880


adult brain 


Invitrogen 


ABR014 


293 394 399 764 768-769 928 967


adult brain 


Invitrogen 


ABR015


738-739 764 


adult brain


Invitrogen 


ABR016


320 374 396 399 405 684 742-743 767
931 947 967


adult brain 


Invitrogen 


ABT004 


21 33-34 37-38 47 52 57-58 69 72 91-93 
109 119 122-124 126-127 135 142-143 
158 167-168 185-188 194 200 212 232 
242 246 255 258 270 277 279 293 301 
312-313 319 322-323 331 341 346 348 
371 374 388 391 394 399 401 409 411
429 436-437 456 462 477 488 496 498 
510 512 515 539 542 545 549 559 563 
573 579 587 589 601-605 612 620-621 
624 640 643 647 681 715 723 728 732 
735-736 740 745 748 753 766 785-786 
792-793 797-801 812 822 829-831 853- 
856 859 876-877 884 893-894 908-909 
918 925 933 950 969 978 


cultured 
preadipocytes


Stratagene


ADP001


4 28-29 69 93 114 121 132-133 135 151- 
152 159 167 172 178 181 184 190 194- 
195 203-204 209 217 219 240 248 260- 
262 267 273-274 277 282 297 301 304 
312 314 326-327 361-362 371 374 388 
394 401 403 405 411 420 437 453 466-
467 470 474 478 496 507-509 517 530 
532-533 584 588 593 602-603 608 610 
617-621 630-631 633 639 642-643 661
693 729 746 761 765 769 834 842 848
887 907 923 947-950 957 967 969 


adrenal gland 


Clontech 


ADR002 


1 3 12-13 21 23-24 27-29 67 74 78 103- 
105 108-109 113 115 118 120-121 128- 
133 149 156 160 172 177 182 214 217
223 232-233 247 254 269-270 273-274 
277 283 285 288 298-299 308 317 319 
328 338 340 342 361-362 364 372 376- 
377 382 384 401-402 405-406 416 420
431 437 444 446 448 457 462 484 500
507 517 524 532-533 539 545 554 561-
562 564 588 597 602-603 606-607 635
642 646 649 658 664 674 693 703 730
740 745 752 759 765 767 775 779 799
809 817-818 839 845 856 859 863 887
890-891 896 948 953 958 961-963 973


adult heart 


GIBCO


AHR001


1 3-4 8 10 14 20-21 25 28-29 33-34 37-38 
41 48 54-57 65 69-72 75 78 80 82-83 97 
99-100 108 112-115 117-121 123-124 
128-133 141 144-146 149 152 159 162- 
163 168 172 176 179 181 184 186-187 
190-191 201 203 208-209 212 216-218 
221 223 227 229 233 244 247 249 253-
255 258 263-264 267 269-270 274 278 
280-282 285 289 291 295 297-299 301 
303-304 308 313 317 321-322 326 328 
334 344 348 352 358 361-363 370-371 
380 382-383 388 394-396 398 401 403 
405-406 410-416 423 425-427 430-431 
436 452-453 464-465 470-474 481-484 
487-488 490 492-494 496 499-500 505- 
506 508-509 514 523 529-530 533 547- 
548 553 558 563-565 577-578 586-588 
590 593 597 601-603 606-608 610-613 
617-619 621-622 626-628 637-638 642- 
644 652 658 661 672 682-683 688 691 
693 697 699 708 711 713 715 732 737
745 747-748 750-753 759 761 765 768- 
770 775 790 802-803 814-815 818-819 
830 837 839-840 842 845 848 859 861- 
862 867 876-877 887 891-892 896 900- 
901 903 905-906 908-909 919-920 922 
925 928 936 939-940 946-947 950 953 
959 967 970-971 973 977


adult kidney 


GIBCO 


AKD001


1 3 8 12-14 17 19-25 28-29 33-34 37-39
41 46-48 50 52 55-60 62 65-67 69 71-72 
75 77-78 82 84 89-90 93 97 108-110 114-
116 118-121 123-125 128 130-133 135
138 144 146 149 156 159-161 163-164 
167-172 176 179 184 186-187 189-190 
194 196 200-202 204 209 211-212 216-
217 219 221 223-224 229 232-235 244
247 250 253 255-256 258 263-264 268-
272 274 277-281 283 286 288-290 292 
294-295 297 301 303-309 311-314 316 
319-323 325 328-338 342 348-349 352 
354-355 358 361-363 365 370-371 373 
376-378 380 382-383 388 395-399 401- 
403 405-406 409-413 416 418-420 425- 
428 430-431 440 442 452-454 462 464-
465 470 472-474 477 479 481 483-485 
487-489 492-495 498-500 504 506 510 
517 522 525 529-530 532-533 539 542- 
543 547 551-552 558 560-564 569-570 
573-574 577-578 580-583 585-590 594- 
596 601-608 610-613 617-621 624 626- 
628 630-631 634-636 639 642-643 648
652 656 658 664-665 676-677 679 681 
688-691 693 697 699 708 711 715 717 
720-722 724 729-732 738-741 747-748 
751-753 761 765 770-778 780 784 789 
791 793 797 804 813 817 823-824 834 
837 839 842-843 845 848 859 861-862 
864 867 870 876-877 887 889 892-894 
896-897 900-901 903 907 913-915 918 
921 923 925 929-930 932 939 942 946- 
947 949-950 953 958-959 961-963 967 
969 972 977 


adult kidney 


Invitrogen 


AKT002


1 3 16 21 30 32 35 38-41 46-47 56 77 92 
109 123-124 130-131 146 149 161 167-
168 172 176 190 209 212 234-235 258
279 292 301 303 308 314 333 355 363
372 380 383 396 399 402 418-419 426-
427 431 448 454 461 471-474 488-489
495 498 504 506 508-509 520-521 530 
537 539-541 545 547 563 582-583 592 
613 617-618 621 623-624 633 655 688 
690 693 699 704 713 732 745 752-753 
761 766-768 770 784 789 797 837 842 
848-849 866-867 877 887 893-894 903 
914-915 925 929-930 937 944-945 947- 
949 955 961 967 984


adult lung 


GIBCO 


ALG001


1 3 14 18 28-29 38 54-56 59 92 110 114- 
115 130-131 146 149 156 159 164 167 
176 184 209 217 234-236 240 255-256 
258 263-264 269 271 276 280-281 297 
305 308 312 314 322 325 332 336 344 
353 361-362 388 401 410 420-421 426-
427 431 465 469 474 484 498 500 506 
508-509 517 530 532 573 592 596 613 
619-620 623 626-628 638 658 679 681
684 689 717 731 741 771 791 799 817 
834 845 861-862 864 875-876 901 921 
925 928 932 940 947 949 959 962-963
967


lymph node 


Clontech 


ALN001


3 10 110 146 160 168 196 209 221 269 
278 301 336 348 394 405 411 420 422 
459 464 474 485 503 506-507 532 563 
582 619 623 630-631 642 669 684 697 
713 715 727 747 767 769 789 825 839 
842 849 887 896 913 921 925 


young liver 


GIBCO


ALV001


3 14 16 37-38 41 51 56 60 97 104-105 
108 110 117 119 128 130-131 134 139 
149 152 169-172 176 184 189-190 200 
209 212 216 218 228 232 255 258 263 
270-271 275 285-286 292 295 298-299 
301 304 314 341 358 365 368 376 400 
410-412 431 474 481-482 485 496 500 
504-505 517 520-522 524 530 532-533 
547 551 563 581 583 610-611 621 624 
635 643 691 708 711 715 720 752 755
761 768 796-797 811 818 830 845-847 
852 864-865 867-869 896 899 910-911
949 958 965 969 972-973 


adult liver 


Invitrogen 


ALV002 


3 37 42 56 60 71 82 104-105 114-115 
117-118 125 130-131 134-135 164 169- 
172 176 179 200 203-204 212 217 223 
226 232 237 244 263 274-275 292 301 
310-312 314 317 349 354 364 368 372 
376 398-399 402 426-427 439 442 451 
458 465 474 482 485 490 506 515 525 
527 545 547 552 568 571 573-575 582 
587 594-595 604-605 608 610 621 630- 
631 634-635 637 657 664 690 693 699 
723 726 745 751 763 767 784 793 811
822 845 848 852 856 861-862 864 892 
899 908-909 925 950 958 967 983 


adult liver 


Clontech 


ALV003 


60 134 169-171 275 


adult ovary 


Invitrogen 


AOV001



1 3 9-10 12-14 16 18 20 22-25 28-29 33- 
35 37 39 41-42 46 48-50 55-57 59 63-67 
69 71-72 75 77-80 82 88-89 92 101 103- 
106 108-110 113 115 119-121 123-126 
128-133 135 138 142-146 149 151-152 
159-161 167-168 172 174 176-177 179 
181 184-190 194 198 200 203 208-209 
211-212 214 217 219 221 224 226 232-
235 240-242 246-247 249 251 254-255 
258-259 264 269-271 274 276-277 279- 
283 285 288 290 293-294 297 301-304 
306-308 311 314 319-322 325-326 328-
329 331-332 335-338 341-342 344 348 
354-358 361-363 365 368 370-372 374 
376 379-380 382-383 388 394-396 398- 
399 401-402 405-406 409-412 416 418- 
421 423 425-433 438 442-443 449-452
454 462 464 466-467 469-471 474 479
482-484 488 490 492-496 498 500-504
506-509 511 515-518 520-524 529-530 
532-533 537 539-542 545 551 555 558 
560-565 569 571 573 577-578 581-583 
585-590 592-593 596-597 600-605 608 
610-611 613-614 617-628 633-637 639 
642-643 646-648 650 652 654 656 658 
664 668-670 672 674 679 681 684 688 
691 693 697-699 701-702 713 717 721- 
722 724 729-732 738-744 747-750 752- 
753 755 759 761 765 767-774 779-780 
783-784 789 793 795-797 801 813-818 
823-824 828 830-832 834 837 839 841- 
842 845 848-851 856 859 862 864 866- 
867 870-871 874-878 881-883 887-889 
891 893-894 896-897 901 903 906-911 
913 919-922 925 928 930 936 939-940 
943-944 946-947 949-950 952-953 955
957-958 962-963 965 967 969 971 973 
977 981-982 


adult placenta 


Invitrogen 


APL001


41 56 67 253 301 304 334 380 383 451 
474 479 500 577-578 643 648 729 767 
856 859 866 873 962-963 


placenta 


Invitrogen 


APL002 


3 21 31 38 63-64 78 135 143 168 186-187 
212 232 244 263 280-281 334 336 344 
348 371 374 394 399 461 490 582 588 
602-607 610 620 699 745 769 793 817 
822 859 897-898 923 928 931 943 949-
969 973


adult spleen 


GIBCO 


ASP001


1 3 21-22 46 52 54-55 57-58 61-62 72 74
78 82 88 118 121 130-131 137 152 159 
168 172 189 203 209 217 223 234-235 
252 255 263 269 271 274 282 288 290 
301 314 322 335 350 363 394 403 405- 
406 410-412 415 431 459 464 472-474 
482 488 500 506 510 514 517 532 537 
542 561-563 589 593 602-603 610 613 
619 621 636 642-643 655 658 662 674 
676 679 681-682 684 689 691-692 697 
699 715 720 723 729 747-748 769-770 
782 793 818 830 834 845 856 859 862 
877 887 893-894 896 903 906-907 914- 
915 918 925 928 930 940 946 965 967 
977 982 


testis 


GIBCO 


ATS001


6 22 28-29 33-34 41 48 52 62 65 72 97 
106 109 118 132-133 145-146 168 172 
176 183 185 189-191 195 209 211-212
214 221 223 230 254-255 258 263 269 
283 297 312 314 321 342 352 361-362
365 380 383 388 395 401 405-406 412 
430-431 441 469-470 474 479 495-496 
500 506 520-521 533 543 545 548 560
563 574 582 589-590 693 608 616-618
620 623-624 638 642-643 697 699 708 
711 745 747-748 765 767-768 779 784
789 812-813 834 837 839 848 859 862 
868-869 875-877 887 889 893-894 896 
928 944 947 953-955 972 981 



Genomic DNA 
from BAC
63118 



Research 
Genetics 
(CITB BAC 
Library) 



BAC001



515 



Genomic DNA
from BAC
39316



Research
Genetics
(CITB BAC
Library)



BAC002



640



Genomic DNA
from BAC
39316



Research
Genetics
(CITB BAC
Library)



BAC003



640



adult bladder



Invitrogen



BLD001



bone marrow Clontech BMD001



50 55 66 71 111 143-144 148 160 201 209 
223 255-256 280-281 286 305 315 319 
340 394 431 442 488 497 505 518 552 
588-589 621 636 664 676 715 738-739 
769 790 824 837 845 877 887 936 940 
948 962-963 967 



bone marrow Clontech 



BMD002 



3 10-13 16 18 20-21 25 28-29 31-34 41 45 
48 52 54-55 57 59 61 65 67 72-73 75 78 
80 82 84 99 103 108 110 114-115 118-
120 123-124 128 130-133 143-144 148 
152 159-161 163 168 172 174 176 178 
190 192 198 203 209 211 217-218 221
223-224 227 233-236 244 247 249 252 
254 258 260-262 267 269 272 278 280- 
281 284-285 288 290 294-297 301 304 
308 314 317-318 320-321 325 328-330 
333-335 349 351-354 358 363 365 367 
377 382 388 394-397 400 405 408 410-
412 418-421 425-428 431 433 435 442 
449-450 453 455 459 464 468-470 474 
478-479 481 484 490 496 504 506 508- 
509 511 519-521 530 532 539 553 558- 
559 561-563 580 582 586 592 599 608 
610 613-614 617-619 623 625-628 635 
638 641-643 658 664 672 682 699 711
713 717 731 734 740 742-743 745 761 
768-771 774 776-778 784 787 789 813 
817-818 822 834 839-840 842 848 862 
866 870 876 885-887 891 896-898 900 
903 906 913 919 921-922 927-928 939 
944 947 950 953 959 961-963 967-968 
970 973 977 



3 9-10 15-19 30 33-34 39 45 54 57 63-64 
71 82 102 116 119 130-133 148 152 156
159-160 168 176 182 224 254-255 271-
272 282 285 290 297-299 301 305 323 
333 340 344 351-355 358 361-362 364 
367 370 372 387 394-395 399 403 405 
409 411 449-450 459 461 468 474 488-
489 524 530 532 580-582 592 602-603 
611 617-618 621-622 630-632 642 661 
663 694 717 730 734 740 745 752 755 
761 767 769-771 775-778 784 787 811 
813 818 832 840 842 849 859 878 887 
893-894 896-898 903 906 908-909 923 
928 944 946-949 953 958-963 965 982 


bone marrow 


Clontech 


BMD004 


54 


bone marrow 


Clontech 


BMD007


766 887 928 


adult colon 


Invitrogen 


CLN001


22 37 67 97 117 121 148-149 168 172 190 
200 204-205 232 244 263 268 292 301- 
302 363 377 384 452 455 459 470 530 
582 602-603 619 687 723 728 751 761 
831 861 887 914-916 934 955 969 984 


Mixture of 16 
tissues - 
mRNAs* 


Various 
Vendors* 


CTL016 


358 740 760 


Mixture of 16 
tissues - 
mRNAs* 


Various 
Vendors* 


CTL021


468 527 928 


adult cervix


BioChain 


CVX001


1 3 10 14 22 28-30 37 41 47-48 51-52 54- 
57 71 82 89-90 92 106 108 110-111 117- 
118 121 129-131 135 141 143-146 160-
161 164 168 172 177 189-190 193 195 
200 204 209 211-212 217 226 229-230 
232 234-235 240-242 246 254 260-263 
268-270 274 277 282 285 292 295 297 
305-308 314-316 319 328 343-344 348 
354 358 363 368 380 382-384 389 394 
396 399 401 405-407 410 416 418-421 
428 430-431 437 442 453-454 459 464
469 471-473 476 480 484 492-495 500 
504 506-509 516-517 526 530 532 545 
550-551 563-565 569 577-578 585-586 
590 608 611 613 619 621 623 628 630- 
631 634-637 641 643 648 656-658 664- 
665 674 679 682 689-690 693 700 703 
708 713 721-722 724 728 732 742-743 
747 750 752 755 757 761 763 767-769 







* The 16 tissue mRNAs and their vendor sources are as follows: 1) normal adult brain mRNA (Invitrogen), 2)
normal adult kidney mRNA (Invitrogen), 3) normal adult liver mRNA (Invitrogen), 4) normal fetal brain mRNA
(Invitrogen), 5) normal fetal kidney mRNA (Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal
skin mRNA (Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) human bone marrow mRNA (Clontech),
10) human leukemia lymphoblastic mRNA (Clontech), 11) human thymus mRNA (Clontech), 12) human lymph
node mRNA (Clontech), 13) human spinal cord mRNA (Clontech), 14) human thyroid mRNA (Clontech), 15)
human esophagus mRNA (BioChain), 16) human conceptional umbilical cord mRNA (BioChain).



779-780 784 788 810-811 813-815 822
834 836-837 839 848 861 866-867 871
874 877 887 891-894 897-898 901 913
916 919 921-922 925 946-947 953 958-
959 967 969 973



diaphragm



BioChain



DIA002



3 39 184 203 431 563 848 967



endothelial
cells



Stratagene



EDT001



3 6 8-10 14 19-24 28-29 33-34 37 39 41 
46 48 52 55-58 62-65 67 69 71-72 75 78 
80 82-83 87 101-102 108-109 114-115 
117 123-124 128 130-133 135 138 143 
145-146 149 156 159-160 167-168 172 
174 176-177 179 181 184-187 189-190 
194-195 200 203 208-209 212 216-217 
219 223-224 226-227 229 234-235 244 
248-249 254-256 258 263-264 267 269 
271 274 276-282 285 290-291 294 297 
301-304 308 311 313-314 316-317 320- 
321 323 325-326 328-329 331-332 334- 
337 339-341 344 348-349 352 354-355 
358 361-363 365 367 371-372 375 379- 
380 383 389 394-395 398-403 405-406 
409-412 425-428 437 442-443 448 454 
464 466-467 474 479 481 490 492-498 
500 503 506-509 511 517 520-521 523- 
524 530 532 537 540-542 558 561-563 
565 569-570 573 581-583 586 588-589 
596 602-608 610-611 613 617-622 625 
628 630-631 633-637 642-643 646 648 
650 652 659 661-662 682 688 690-693 
696 698-699 708 712 715 717 720-722
724 727 729 740 745 748-750 752 761 
765 767-770 772-773 779 784 789 792- 
794 796 802-803 811 817-818 821 824 
827-828 830 834-835 837 842 845 848 
859 861-862 864 866-867 870 876 885 
887 891 893-894 897-898 900 903 906- 
907 913 916 921 925 939 947 950 953 
955 957-958 962-963 967 973 978 984 



Genomic
clones from the
short arm of
chromosome 8



Genomic
DNA from
Genetic
Research



EPM001



324 515 640



esophagus



BioChain



ESO002



97 103 128 371 474



fetal brain



Clontech



FBR001



67 129 156 159 232 267 433 446 503 845
952



fetal brain



Clontech



FBR004



28-29 185 213 277 350 384 432 485 501
549 651 747 754 761 780 787 848 870
887 906 958



fetal brain 



Clontech 



FBR006 



10-11 14 21 30 32 47 49 56 65 69 72 77-
78 82 84 97 101 115 118 121 125 128
130-131 138 142 148 152 159-160 179
185 188 194 197 203 210 212 214 219
222 227-229 243-246 249 252 256 264
270 273 282 285 290-291 293 301-303 
305-306 312 321-322 325 327 339-340 
344 346 350 354-357 363 367-371 374 
388 391 394-395 399 402 405-406 410 
414 420 426-427 436-437 442 444 454 
456-457 460 462 464 470 480 485 492- 
494 507 510 516 524 528 530-532 539- 
542 549 553-554 561-562 580-582 588- 
589 602-608 611 615 617-619 621-622 
624 632 636 641-642 646-647 651-653 
661-662 666-669 672 677 691 715-716 
730 735 740 752 754 761 767-770 772- 
775 780-781 799-801 808 818 822-823 
835 843 845 856 859 864 867 876 880 
885 887 890 893-894 896 913 918 926 
942 946-947 951 957-959 962-963 970- 
971 


fetal brain 


Clontech 


FBRs03 


130-131 312 517 637 691 738-739 


fetal brain 


Invitrogen 


FBT002 


3 22 28-31 47 57 63-64 72 75 77-78 86
94-95 97-98 126-127 135 140 143 156 
159-160 167-168 177 185 190 196 201 
203-204 214 217 230 254-255 258 267 
273-274 277 279 282-283 292 301-302 
305 312 314 323 329 346 348 367 374 
382 394 399 401 403 412 415 420 432 
437 474 482 485 495 507 513 517 527 
529-530 539-542 548 552 579 587-588 
600 6Q4^66f6i2 6^^18 621-622 624 \ 
634 642-643 647-648 650 679 689 693 
699 712 715 742-743 745 748-749 753 
768-769 793 797 829-831 834 845 848 
856 859 893-894 908-909 913 916 931 
933 940 950 967 969 


fetal heart 


Invitrogen 


FHR001


19 57 130-131 394 431 642 769 844 


fetal kidney 


Clontech 


FKD001


3 31 33-34 38 48 54 72 160 208-209 211 
223 264 269 277 283 290 313 325 341 
348 358 396 418-420 474 484 506 508- 
509 517 520-521 532 547 553 558 567
569 587 596 608 610 613 619 622 626-
627 642 679 734 745 818 843 887 896 
903 916 969 971 


fetal kidney 


Clontech 


FKD002 


19 474 726 903 


fetal kidney 


Invitrogen 


FKD007 


3 118 186-187 230 244 271 432 887 969 


fetal lung 


Clontech 


FLG001


69 132-133 156 168 208-209 217 267 269
274-275 286 354 394 396 406 462 483-
484 608 619 751 769 771 834 914-915
925


fetal lung 


Invitrogen 


FLG003 


3 8 28-29 32 39 50 66 82 88 92 168 186- 
187 200 204 212 226 229 246 274 309
327 332 368 374 382 394 398 426-427 
431-432 442 485 536 555-557 587 604-
605 621 624 636 642-643 661 ^iTm
724 753 769 848 859 864 877-878 896 
902 904 914-915 958 



fetal lung



Clontech



FLG004



130-131 394 664 769 942



fetal liver-
spleen



Columbia
University



FLS001



3 8-10 12-13 16-17 19-25 27-29 33-35 37
38 41 45-46 48 52 55-58 60-67 69 71-74 
77-78 80 82 84 87-90 104-106 108-109 
112-121 123-125 128-134 138 141 143- 
146 149 151 156 159 163-164 167-172
174 176-179 181 184 186-188 190 194 
200-201 203 208-209 211-212 216-217 
219 224-227 229-230 232 234-235 237 
241 243-244 246-248 254-255 258 260- 
263 267 269-270 273-282 284-285 288- 
290 292-295 297-299 301-306 308 311- 
318 320-323 326 328 332 335 341-344 
348 352 354-359 361-365 367-368 371- 
374 376-380 382-383 388-389 394-396
398-399 401-411 413-414 416 418-421
425 428-430 432-433 437 439 442-444 
449-450 452 456-457 461-470 472-474 
478-479 481-482 484-485 487 490-494 
497-499 504-507 511 514-515 517-521
523-524 526 529 532 537 540-541 547 
555 558-559 563 575 577-578 580-596 
598-599 601-603 606-608 610-613 617- 
624 626-628 630-631 634-636 639 642- 
643 647-648 654-656 663-665 672 674- 
675 679 681 684 686 688 691 693-699 
711 713 715 717 719-726 729 732-733 
738-740 745 748-749 751-753 757 759 
761 767-770 776-778 780 784 787 792- 
794 799 804 809 811 813 817-819 822- 
825 830-831 834 837 840 842 845-848 
852 856 859 861-862 865 867-869 871 
874-878 887-888 891 893-894 896-900 
903 905-911 913 916 918 923 928 930- 
931 936 939 942 944 946-950 952 958- 
959 961-963 965 967 969-970 972-973
976-977 981-983 



fetal liver- 
spleen 



Columbia 
University 



FLS002



3 8-13 15-17 19-20 22 25 28-29 33-35 37 
41 45-46 52 54-56 60-61 63-64 66-70 73- 
74 78 80 82 92 99 104-106 108-109 112 
115-116 118 120-121 123-125 128 132- 
135 139 141 143-144 146 149 152 156 
159-161 167 169-172 174 176-177 179 
181 185 188 190 194 196-197 200 204 
212 214 216-218 223-224 226-230 232- 
235 237 246-247 252 254-255 258-263 
267 270-277 284-286 288 292 294-295 
297-299 301 303-305 308 310 314 318 
320 323 328 330-332 335-337 340 342-
344 352 354-355 358 361-365 367-368
371 373-374 376-377 382 388 394-396 
398-399 401 405-406 409-411 413 418- 
421 429 431 439-440 442-444 451-452 
457 462-463 466-468 470 474 477-479
481 483-484 487-488 491 495 499 504 
508-509 516 519-521 524 526-528 530 
532 537 540-541 543 545-547 550-551 
553 555 560 564 568 574-575 577-578 
580-592 596-597 600 602-603 608 610- 
611 613-614 617-618 621-622 628 630- 
631 634 637 639 642 644 647 654 658- 
659 665-667 669-675 679 681 684-685 
688-690 693 695 697 708 711 713 715
717-719 723-727 729 731-734 738-739 
741 745-746 749-750 753 759 761 766- 
767 769-770 776-779 782 784 791-792 
794 805 808 817-818 822 824-825 830 
834 837 842 845-849 852 856 859 864- 
865 867 874-878 888 891-892 896-900 
903 905-906 908-909 913 916 918 921 
923 925 932 936 939-940 942 944 946- 
947 949-950 953 955-956 958-959 961- 
963 965 968-970 973 977-978 981 


fetal liver- 
spleen 


Columbia 
University 


FLS003 


19 60 78 224 273 275 370 373-374 401
602-603 639 643 730 732 738-739 748
752 770 782 928 930 947 949


fetal liver


Invitrogen


FLV001


37 55 60 69 72-73 97 104-105 108 113-
ilWfd-ifH21 135 "143 i'67-168 ' 
186-187 195 200-201 209 217 223 240
244 253 255 275 284 301 311 314 317 
336 342 348-349 358 371 374 382 394 
402 411-412 418-419 428 430 442 453
517 568-569 580 582 584 587 589 601- 
603 606-608 617-618 624 634 639 642- 
644 646 664-665 669 679 715 717 720 
726 745 748 751 769-770 782 791 794 
797 824 830-831 845-847 852 859 870 
899 913-916 925 928 948 956 958 969 
976 982 


fetal liver 


Clontech 


FLV002 


72 418-419 632 


fetal liver


Clontech 


FLV004 


3 160 169-171 355 367 374 376 547 617- 
618 621 646 717 741 771 836 878 976 


fetal muscle 


Invitrogen 


FMS001


15 27 32 37 67 72 83 99 112 121 138 167 
174 177 186-187 190 203-204 211 215 
230 252 259 312 374 403 406 409 457 
461 485 505 517 528 530 540-541 544 
549 554 558 579-580 583 602-603 608 
639 642-643 654 664 699 715 730 737 
751 772-773 788 802-803 810 848 856 
859 864 868-869 887 893-894 905-906 
910-911 923 948 967


fetal muscle


Invitrogen 


FMS002 


15 99 130-131 223 361-362 431 474 505
581 639 643 666-667 784 790 808 810-
811 874 880 887 903 946 950 958 962-
963 973 


fetal skin 


Invitrogen 


FSK001


3 6 20-22 32-34 41-45 47 49-52 55 63-64 
66 69 77 80 88 91 98 101 111-112 115
126 130-131 135 142 144 146 160 163 
167 176 188-190 196 201 204 208 213 
215 217-218 229 232 244 246 248 255 
263 265-269 274 279-281 283 285 288 
292 294 297 301 303 308 314 321 341- 
342 344 348 354-355 358 361-362 366 
369 371-372 374 381-382 384 386 394 
401 403 405 413 415 428 431 437 440 
460 466-467 472-473 477 481 483 495 
499 504 517 522 532 536-537 539-541 
545 556-558 569 574 576-578 580 584- 
585 587-589 592-593 602-603 606-608 
612 617-618 621 624 634 637 639 642- 
643 647 664 673-674 676 680-681 689 
699 705-707 709-715 724 728-730 738- 
740 745 748 752 765 768-769 772-773 
793 797 817 823 830 834 842 848 859 
861 864 870 874 883 887-888 893-894 
901 904 908-909 913-916 923 925 947 
950 958 962-964 967 975 


fetal skin


Invitrogen 


FSK002 


3 130-131 146 194 306 354 367 400 405 
474 489 520-521 547 558 561-562 585 
596 730 740 748 755 767 771 810 840 
893-894 946 959 


fetal spleen 


BioChain 


FSP001


276 563 842


umbilical cord 


BioChain 


FUC001


3 20 33-34 39 48 50 52 55-57 65 67 69 72 
77 79 82 92 109 112-113 121 132-133 
138-143 156 167-168 172 174 179 184- 
185 190 194-196 200 202-203 208-209 
229-230 244 269-271 278 284-285 290 
297-299 303 305 308 320 331-332 336 
338 342-343 363 367 372 374 379-380 
383-384 392-394 397 399 402 405-406 
410 425-427 429-430 449-450 474 476 
484 497 499 501 504-505 510 515 517 
532-533 539 549 551 558 563 569 574 
577-578 581 586-587 597 602-603 608 
610 617-619 621 626-627 634-637 639 
642-643 658 663-664 674 690-691 693- 
694 699 713 715-717 720 724 726 729 
738-739 746-747 749 759 761 765 768- 
769 774-775 793 797 807 818 822 837
848-849 856 862 868-869 874 885 887 
892-894 903 906-907 916-917 919-920 
928 936 939 944 946-947 962-963 967 
969


fetal brain


GIBCO 



HFB001


3 9-10 12-14 16 21 25 28-30 32-34 37-39 
41 47-48 52-53 56 65 67 69 71-72 75 80 
84 92 97 103 106 110 114 117-119 123- 
124 127 129 132-133 135 138 141-142 
144-146 148-149 152 156 159-160 168 
172 174 176 179 181 184-185 190 198 
208-209 212 214 219 221 223-224 229- 
230 233-236 240 244 247 251 253-255 
258-259 270 273 276-277 285 297 304- 
305 308 312 314 322-323 325 328 332- 
333 335-337 339-340 342-344 346 352 
354 358 363 365 370-372 374 382 394- 
396 398 401 403 405-406 409-412 414 
416 425-427 431-432 437 442 445 453 
456 462 466-467 469-470 472-474 479 
483 488 490 492-497 500-501 504 506- 
510 520-521 524 530 537 539 545 549 
552 558 560-562 564 569 579 582-583 
586-587 596 602-608 610-612 614 617- 
624 626-628 630-631 633 635 638 641 
643 647-648 656 658 661 676 679 688- 

f\M fiO"? Mfi-f\Ql 71 1 -71 9 71 S 794 79fi 
7'^1 1'^'^ 74S 747-740 7^9 7^4 7/51 Ifi'i ■ 
767-770 774 779-781 784-786 789 799- 
800 802-803 813 818-819 823-824 831 
834-835 837 839 845 848 859 864 866- 
867 871 874-875 881 887 891 893-894
896-897 900 906-907 910-911 918 921-
922 925 927-928 930 943-944 946-947
950 953 962-963 965 969 972-973 977 


macrophage 


Invitrogen 


HMP001


86 168 186-187 297 537 608 681 761 845 
877 


infant brain 


Columbia 
University 


IB2002


2-3 9-10 12-14 16 21 25 27-30 32 37-38 
46-47 49 55-56 58 65 69 71-72 78-79 82 
84-86 91-92 98-99 106 109-110 113-115 
118 127-128 130-133 135 138 142 144 
151 156 168 173-176 180-181 185-188 
192 194 196-201 203 208 210-212 214 
217-218 224 229-231 233 236 238 240-
241 244 246 251-256 259 263 270-271 
277-279 284-285 287 293-294 296 301- 
302 308 312-314 317 322-323 327 330 
333 339 342 345-346 351 354 358 361- 
362 365-366 368 370-371 373-374 382 
388 394-396 402 405-406 411-412 415- 
416 420 424-425 428 431 436-437 440- 
441 444-445 453 456 460 465 474 479 
482-483 488 495-496 498 501 503-504 
506-510 515-517 520-521 524-525 529 
531-532 534-535 537 539-542 544-545 
549 561-562 569 574 577-578 580-583 
586-587 589 592 596 600-608 610 612-
613 616-618 620 622 624 629-632 634-
635 637 641 643-644 650-651 653 661 
663-664 676-677 689 693 695-698 708 
711 720-722 724 730 732 735 740 745-
748 754 765-766 768-769 779-781 785- 
786 789 791 796 798 800-803 807 811- 
813 818-819 822-824 830-831 834-835 
837 839 842-843 845 854 856 858 864 
867-869 875-877 879 881 887 892-894 
896 903 907-911 913 916 919-920 925
930-932 936 939 943 946-947 953 958 
970-973 977-978 982 984 


infant brain 


Columbia 
University 


IB2003 


3 12-13 21 27-29 32 39 49 69 72 82 91 
113 116 126 128 132-133 142 144 156 
176-177 184-185 188 194 208 212 223-
224 228 230 244 255 259 267 270 273 
276 293-294 312 320 326-327 337 342 
346 354-355 358 361-363 382 388 390 
394 396 399 402 420 425 431 442 462 
474 482 484 488 495-496 510 520-522 
524 529 540-541 549 563 582 586 588- 
589 596 600-603 606-607 612 617-618 
620-621 632 647 650 679 720-722 724 
735-736 746 751 754 769 785-786 793 
800 807 811-813 818-819 822 824 831 
834 838-840 843 856 864 892 896 907 
919-920 925 930-931 936 947 950 957 
973 982 


infant brain


Columbia
University


IBM002


16 47 82 84 201 263 302 376 394 421 440
488 537 592 606-607 635 740 769 887 
892 906 921 926 971 


infant brain 


Columbia 
University 


IBS001


84 86 180 185 198 201 203 230 279 312
326 346 354 366 388 488 542 581 588 
620 647 664 732 740 785-786 801 807 
822 827 910-911 925 931 


lung, fibroblast 


Stratagene


LFB001


3 11 25 49 65 75 114 141 156 160 172 
190 198 209 217 224 229 234-235 267 
269 274 277 282 284 303 308 312 320 
334 336 352 372 396 398 412 414 437 
453 464 470 481 492-494 508-509 532 
539 581 584 617-619 621 628 633 643 
688 691 745 752 761 768 794 822 837 
848 876 887 953 967 973 


lung tumor 


Invitrogen 


LGT002 


1 3 9-10 12-13 20 31 38 41 46 48 51-52 
56 58 63-64 72 74-75 78 82 88 101 106- 
107 110 114-115 117-118 120-121 123- 
124 128-133 135 143-146 149 151 156 
159-161 163-164 167-168 172 176 178- 
179 184-185 189-191 194-196 200 203 
209 212 216-217 226 228-229 232 234- 
236 241 246 248 256 258-259 263-264 
269-271 274 282-283 285-286 290 292 
294 297 301 308-309 311 314 317 321
326 328-329 331 333-334 341 348 352 
354-355 363 365 371 380 382-383 388 
394-395 398-402 405-406 410-411 413
416 418-419 426-427 439 442 452-453 
458-459 461-462 464-465 470-471 474 
478 483-484 490 495-496 499 510 522 
524 528 536-537 540-541 543 548 556- 
558 560-565 571-573 580 582 587-588 
592 597 602-605 608 610 612-613 617- 
622 625-629 633-634 636 642-644 648 
661 664 669 679 688-689 691 693 699- 
700 708 717 723-724 730 733-734 738- 
740 745 747 749 752-753 761 767-768 
770 779 782 784-786 789 793-794 797 
817-818 820 823-824 834 837 842 845 
848 855 857 859 862 864 866 870 875- 
877 887 892 896 900-901 907-909 914- 
915 919-920 923-925 939 943 947 949 
953 958 962-963 965 968 970 972-973 
977 


lymphocytes 


ATCC 


LPC001


3 9-11 32 47 50 56 71 75 88 97 99 102

121 125 128-129 135 138 141 149 163 
167-168 212-213 217 233 255 290 294 
301 305 311 314 342 372 377 388 398- 
399 410 437 442 453 470 474 481 495 
500 506 510 529 532 537 542 558 571 
579 604-605 610 620 628 637 643 658
666-667 676 679 697 701 713 728 730
734 749 765 768 796 807 818 822 834 
839 848 859 875 885 887 896 903 906 
914-915 928 947 973 981-982 


leukocyte 


GIBCO 


LUC001


1 3 911 18-19 21 23-25 27 31-34 39 41- 
42 46-48 52 54-58 62-69 71-72 74-75 78- 
80 82 89-90 93 99 110 115-121 123-124 
128-133 135 138 141 143-146 149 152 
156 159-161 163 167-168 176 179 181 
186-187 189-190 194 198 200 203-204 
209 211-212 218-219 226 232-236 240 
244 247 251 253-255 258-259 263-264 
269 271 274 278-279 282-283 285 288- 
290 294-295 297 301-306 311 313-314 
317 320-321 325 328 330-331 335 337 
342 344 348 350-351 353-354 358-359 
361-365 368 371-372 375 388-389 394- 
395 397-401 403 405 407 409-412 421 
425-427 432 437 442 448-450 452 457 
460-461 468-471 474 476 479-482 484 
492-494 496-498 500 506-510 516-517 
520-521 524 529-530 532 537 540-544 
551 553-554 558 560-565 569 577-578 
580-583 586-587 589 592 596-597 602- 
603 606-608 610-624 626-628 630-631
634-635 641-643 654 657-658 661 663- 
665 669 672 677 679 684-689 691 696- 
697 699 708 711 713 715 717 721-724 
728 730 738-740 747-749 755 761 765 
767-769 771 774-779 782 784 789 791- 
792 794-795 797 807-808 811-815 817- 
818 822 824 828 830 832 834 839-840 
842 845 848 856 859 862 864 867 871 
875-877 887 891 893-894 896-898 903 
906-911 913-916 921 923 925 927-928 
930 932 935-936 939 943-944 947 949- 
950 953 958-959 961-963 965 967 972- 
973 982 


leukocyte 


Clontech 


LUC003 


1 41 82 106 119 123-124 160 177 184 201 
212 221 228 271 279 285 295 321 325 
372 394 411-412 443 468-470 530 532 
537 551 569 580-581 613 619 623 626- 
627 642 655 697 761 767 769 775 789 
809 867 887 923 928 950 


melanoma 
from cell line 
ATCC#CRL 
1424 


Clontech 


MEL004 


3 25 55-56 67 71 78 109 121 129 146 167 
172-173 176 200 209 212 258-259 263 
278 297 301 306 312 335 338 340 352 
361-362 367 388 395 402 410 418-419 
429 437 454 464-465 481 496 500 503 
507 524 532 539 560-562 581-582 587 
589 599 612-613 617-621 623 643 657 
663-664 672 715 724 748 752 761 767- 
768 770 785-786 789 835 848 877 887
896 916 919-920 947 967 978-980 


mammary
gland 


Invitrogen 


MMG001


1 14 19 21 28-29 31-37 47 49-51 55 57 
63-67 69 71-72 75-78 92 108-109 111 116 
121 123-124 126 128 130-133 135 143- 
144 148-150 156 159 164 168 172 177- 
179 184 186-187 190 194 200-204 209
212 217 226 230 232-236 241 244 246- 
247 252 255 258-259 263 268 270 275 
279-283 285 290 292-293 301 304-305 
311 313-314 317 320 322-323 326-327 
330 332 338 342-344 348-349 354 360 
363 367 371 374 380 382-383 385 388
394-395 398 401-403 407 409 411-412
418-420 426-427 430 435 437 442 449- 
453 459 461 465-468 470 474 477-478 
480 483 485 488 498 500 503-504 507 
515 519 522 524 529-532 538-541 544 
547 555 560 563 565 569 573-574 579- 

ZOA con cork eex^^ rr\^ /Tit 

joU 5oz 384 587-589 593 597 601-610 
612-613 615-618 620-622 624 634 636- 
637 639 642-644 646-647 650-657 663-
664 674 676 679 688-689 691 693 696
701-703 713 715 717 728 730 732 738- 
739 741-743 745 749 751 753 763 767
769 772-773 785-786 793 796-797 812 
821-824 830-833 837 848 856 859 861 
864 868-870 876-877 887 891 893-894 
898 903-904 907-911 913-918 921 923 
925-926 930-931 936 942 949-950 958 
961 966-967 969 972-973 


induced neuron 
cells 


Stratagene


NTD001


9 65 82 92 106 113 142 146 156 172 176 
191 208 221 258 277 328 333 346 361- 
362 371-372 375 388 410 414 418-419 
440 471 484 495 516 524 529-530 592 
610 628 642 650 745 748 752 761 793 
818 848 851 897 


retinoic acid
induced neuron 
cells 


Stratagene


NTR001


19 87 184 305 385 440 474 626-627 643
748 799 834 977 


neuronal cells 


Stratagene


NTU001


19 33-34 42 70 82 87 109 115 126 146 
172 185 188 194 212 255 269 274 283 
312 317 329 340 361-362 367 379 394 
399 401 410 420 426-427 474 479 507 
530 579 582-583 610 617-618 636 643 
658 732 740 765 769 784 791 793 799 
802-803 818 842 851 864 897 907 932 


pituitary gland 


Clontech 


PIT004 


3 19 123-124 194 255 354 358 373-374 
377 426-427 462 492-494 635 785-786 
793 893-894 


placenta 


Clontech 


PLA003 


138 176 574 896 972 


prostate 


Clontech


PRT001


3 8 16 57 65 75 83 108 130-134 138 141
[illegible] 149-150 159 182 186-187 190 203
209 234-235 276 283 322 413 415 442 
449-450 453 480 484 490 499-500 503 
505-506 523 537 543 564 583 602-603
611 619 623 643 650 697 711 729 761
765 770 776-778 784 789 819 822 831 
839 862 866 887 904 907 921 935 962- 
963 967 973 


rectum 


Invitrogen 


REC001


19 30 33-34 66 108-109 123-124 126 129- 
131 143 149 151 156 164 190 201 240 
247 250 263 268 274 279 287 295 298- 
299 310 314 332 341 354 384 394 401 
420 425 442 446 459 483 485 520-521 
532 545 559 580-581 584 592 602-607 
610 612 615 619 634 637 646 655 664 
683-684 741 769 793 822 870 908-911 
914-916 934 937-938 942 967 973 982 


salivary gland 


Clontech 


SAL001


16 68 74 84 121 123-124 156 172 190 203 
209 232 248 254 269 292 294 363 377 
395 398 400 402 405-406 410 430 442 
459 462 474 483 485 563-564 579 587- 
588 599 602-603 643 658 699 728 730 
737 741 748 794 822 867 876 897 903 
981 
salivary gland


Clontech 


SALs03 


217 254 270 388 610


skin fibroblast 


ATCC 


SFB001


517 949 


skin fibroblast 


ATCC 


SFB002 


269 688 


skin fibroblast 


ATCC 


SFB003 


3 203 897 907 


small intestine 


Clontech 


SIN001


3-4 47 57 68-69 92 99 125-126 130-131 
135 149 151-152 156 159 185 204 241 
246 291-292 318-319 338 343 348 363 
373 375 382 388-389 392-394 397 400 
437 466-467 471 484 500 517 520-521 
525 547 560 580-581 588 599 602-603 
612 624 643 711 731 733-734 757 761 
769 774-775 794 824 864 904 906 910- 
911 913 948 953 959 976 984 


skeletal muscle


Clontech 


SKM001


15 75 135 146 172 190 218 267 282 308 
410 426-427 474 505 588 620 623 658 
692 713 737 779 790 862 874 878 887 
952 962-963 


skeletal muscle 


Clontech 


SKMs04 


215 


spinal cord 


Clontech 



SPC001


14 20-21 25 28-29 31 39 46 48 59 78 83- 
84 91-92 103 112-113 135 160 168 172 
176 188 190 205 209 229 232 258 285 
301 308 312-314 321 323 329 346 374 
377 380 383 388 394 398 406 409-410 
431 449-450 453 455 466-467 470-471 
484-486 488 495 497 500 503 508-509 
524 537 539 558 581 586 604-605 611
619 623 630-631 633 656 663 711 715
729 736 740-741 761 767 769 776-778 
780 818 822 831 835-836 840 843 859 
861 871 875 887-888 897 906-907 913 
919-920 928 931 953 958 


adult spleen 


Clontech 


SPLc01


3 6 12-13 66 130-131 178 365 403 431 
461 558 610 715 797 809 876 947 967 


stomach 


Clontech 


STO001


35 114 130-131 144 155 176 189 206-207
249 260-262 336 382 398 425 431 453 
461 483 496 500 527 530 580 642 657 
663 669 748 765 768 802-803 839 891 
942 981 


thalamus 


Clontech 


THA002 


30-32 48 66 109 127 130-131 135 142
145 156-158 168 172 174 185 199 224- 
225 233 246 277 282 286 293 322 332 
334 346 374 384 400 402 420 424 435- 
437 446 466-467 485 503 506 527 542 
549 572 612 615 622 624 633 643-644 
658 676 736 790 794 824 831 835 896 
907 950 969 


thymus 


Clontech


THM001


10 16 20 28-29 32 37 41 52 57 66-67 74- 
75 110 118 121 129-131 141 151 159-160 
208 211 218 247 269 289 295 297 320
325 354 358 365 367 372 378 388-389 
395 398 411-412 420 423 435 452 500 
508-509 517 524 532 537 551 558 560 
569 577-578 582 586 598 608 611 622
643 684 715 721-723 728 740 766 772- 
773 795 834 837 849 864 885 900 921 
946 948 958 962-963 965 972-973 982 


thymus 


Clontech 


THMc02 


1 3 9-11 16 21 27 32-34 38-39 51 55-57 
66 72 74 77-78 80 82 89-90 101 112 115
118-119 121 123-124 126 138 144 152 
159 168 174 176 178 186-188 197 200 
208 212-214 217 225 233 243-244 246 
254 256-262 279 282 285 288-289 296- 
297 313-314 322 334 343 354-355 358- 
359 363-364 367-368 372-373 382 387- 
389 395 400 402 411 414 426-427 437
440 442 449-450 454 457 462 464 469 
474 479 481 485 490-491 506 508-509 
511 517 522 526 528 532 542 551 554 
561-562 564 566-570 580-582 585 589 
597 599-600 602-608 611 613-614 619- 
621 625 628 630-631 644 646 655 669 
672 677 684 686-693 697 713 717 720
728 740 746 749 760-762 767 771 775 
794 797 804 808 811 816 818-819 837 
840 859 880 883 887-888 896-897 903 
908-911 913 916 924 936 947-948 950 
962-963 965 967 970 


thyroid gland 


Clontech 


THR001


3 8-9 14-15 19-22 28-29 39 41 55-56 66 

69 71-72 78-79 97 104-105 109 113 115 
119 121 123-124 130-133 135 138 143-
144 146 148 151-152 156 159-163 165
168 172 174 177 183-184 196 199-200 
203 209 211 215-218 228-229 232-236
244 254-255 258 273 282 290 292 294 
297 303-306 308 311 317-318 322-323 
325-326 334-335 340 342 348 354 358 
373 377 381-382 387 394 398 401-402 
405-406 409-412 416 422 425-427 429- 
431 440 449-453 462 466-468 474 478- 
479 481-484 490 492-496 500-501 505- 
506 517-518 522-525 532 537 540-541 
545 551 558 560 563-564 580 583 587- 
589 593 597 599 606-607 610 617-621 
625-628 633 635 641-643 658-659 664- 
669 674 682 686 688-691 696 699 715 
724 730 740 742-743 747 750 752 759 
761 765-766 768-769 779 789 796 802- 
803 813 818-819 822 831 837 843 845 
848-849 862 864 868-869 871 874 876- 
877 887 893-894 896-897 907-909 912 
919-921 923 925 928 936 940-942 944 
946-947 950 953 955 958-959 962-963
967 969 973 981 


trachea 


Clontech 


TRC001


33-34 55-56 69 74 163 172 190 209 212 
267 270 297 305 314 352 413 426-427
466-467 500 502 504 580 586 610 613 
633 642 688 691 711 724 738-739 774
782 816 820 839 848 862 868-869 914- 
915 928 968 


uterus 


Clontech 


UTR001


4 9 18 37 63-64 74 108 114-115 130-131
160 166 179 184 190 209 233 249 269 
285 301 314 327 337 348 384 394 399- 
400 403 406 41 !■ dO^ 4H A.'XA Ait a An 

462 474 485 490 508-509 526 532 579 
617-619 636 642-643 672 761 769 793 
837 849 864 887 903 906 928 934 947 
967 



TABLE 2 



SEQ 

ID

NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN 
SCORE 


% 

IDENTITY 


1 
2 


L06175
Y70775 


Homo sapiens 
Homo sapiens 


occurs in MHC class I region; ORF 


308 


98 


3 


X15187 


Homo sapiens 


precursor polypeptide (AA -21 to
782) 


3094 
4112 


98 
100 


4 


AF110640 


Homo sapiens 


orphan seven-transmembrane 
receptor 


344 


100 


5 
6 


G03798 
W85607 


Homo sapiens 
Homo sapiens 


Human secreted protein, SEQ ID 
NO: 7879. 

Secreted protein clone da228_6.


158 


72 


7 

8 

9 


Y30162 

Y15227 
Y28817 


Homo sapiens 

Homo sapiens 
Homo sapiens 


Human dorsal root receptor 4 

hDRR4.

Leul 

pt326_4 secreted protein.


1477 
884 

391 
3338 


100 
88 

100 
100 


10 
11 
12 


X92106 
Y15228 
U27838 


Homo sapiens 
Homo sapiens 
Mus musculus 


bleomycin hydrolase 
Leu2 

glycosyl-phosphatidyl-inositol- 
anchored protein homolog 


2445 
445 
432 


100 
100 
34 


13 


U27838 


Mus musculus 


glycosyl-phosphatidyl-inositol- 
anchored protein homolog 


320 


27 


14 


Y71062 


Homo sapiens 


Human membrane transport protein, 
MTRP-7. 


2323 


99 


15 
16 


U96781 
M16653 


Homo sapiens 
Homo sapiens 


Ca2+ ATPase of fast-twitch skeletal 
muscle sarcoplasmic reticulum, adult
isoform 


5145 


100 


17 


Y13398 


Homo sapiens 


pancreatic elastase IIB zymogen 
Amino acid sequence of protein 
PRO346.


1435 
1749 


99 
99 


18 


Y02283 


Homo sapiens 


Secreted protein clone br342_11
polypeptide sequence. 


1399 


99 


19 


Y53030 


Homo sapiens 


Human secreted protein clone d24_1
protein sequence SEQ ID NO:66. 


1371 


100 


20 
21 


AL031320 
B01384 


Homo sapiens 
Homo sapiens 


dJ20N2.5 (novel protein similar to 
fucosidase, alpha-L-1, tissue (EC 
3.2.1.51, alpha-L-fucosidase
fucohydrolase))


2597 


99 


22


Y68778 


Homo sapiens 


Neuron-associated protein. 
Amino acid sequence of a human 
phosphorylation effector PHSP-10. 


1876 
2470 


100 
100 


23 


Y55935 


Homo sapiens 


Human KHS2 protein. 


4781 


99 


24 


Y55935 


Homo sapiens 


Human KHS2 protein. 


2807 


100 


25 


AC024792 


Caenorhabditis 
elegans 


contains similarity to TR:O95029 


463 


31 


26 


Y07972 


787 


Human secreted protein fragment 


1540 


100 


27 


X97630 


Homo sapiens 


serine/threonine protein kinase 


3781 


98 


28 


AF150755 


Mus musculus 


microtubule-actin crosslinking factor 


3514 


68 


29 


AF150755 


Mus musculus 


microtubule-actin crosslinking factor 


3725 


70 


30 


Z38011 


Mus musculus 


DMR-N9 


2988 


86 


31 


AJ000522 


Homo sapiens 


axonemal dynein heavy chain 


6058 


99 


32 


AF037256 


Mus musculus 


ES2 protein 


2260 


91 


33 


S62140 


Homo sapiens


TLS=nuclear RNA-binding protein 


2917 


100 


34 


S62140 


Homo sapiens 


TLS=nuclear RNA-binding protein


2890 


98 


36 


AB038237 


Homo sapiens 


G protein-coupled receptor C5L2 


1767 


100 


37


D79994 


Homo sapiens 


similar to ankyrin of Chromatium 
vinosum. 


6089 


99 


38 


X63380 


Homo sapiens 


serum response factor-related protein 


1966 


99 


39 


AL022072 


Schizosacchar 
omyces pombe 


lipoic acid synthetase 


1067 


61 


40 


J03930 


Homo sapiens 


alkaline phosphatase


2751 


100 


41 


AF132968 


Homo sapiens 


CGI-34 protein 


1088 


98 


42 


AL117637


Homo sapiens 


hypothetical protein 


2208 


100 


43 


AL021393 


Homo sapiens 


bK747E2.1 (novel protein)


1526 


100 


44 


X68011 


Homo sapiens 


ZNF81 


1886 


100 


45 


AC002464 


Homo sapiens 


organic cation transporter; 50% 
similarity to JC4884 (PID:g2143892) 


2423 


100 


46 


W78245 


Homo sapiens 


Fragment of human secreted protein 
encoded by gene 19. 


1949 


100 


47 


Y41765 


Homo sapiens 


Human PRO1083 protein sequence.


3604 


100 


48 


AF097330 


Homo sapiens 


H1 chloride channel; p64H1; CLIC4


1305 


99 


50


U09413 


Homo sapiens 


zinc finger protein


1361 


57 


51 


AF061812 


Homo sapiens 


keratin 16


2374 


100 


52 


W63681 


Homo sapiens 


Human secreted protein 1. 


1326 


99 


53 


AB035303 


Homo sapiens 


cadherin-10 


4094 


100 


54 


A12022 


synthetic 
construct 


MRP-8 


485 


100 


55 


AL121897 


Homo sapiens 


bA392M18.3 (KIAA0180) 


1867 


100 


56 


Y73330 


Homo sapiens 


BTBM clone 397663 protein 
sequence. 


818 


96 


57 


AF151018 


Homo sapiens 


HSPC184 


955 


100 


58 


AF125042 


Homo sapiens 


bisphosphate 3'-nucleotidase


1586 


100 


59 


AF118670 


Homo sapiens 


orphan G protein-coupled receptor 


1971 


100 


60 


X04494 


Homo sapiens 


precursor polypeptide 


1903 


100 


61 


AF205565


Homo sapiens 


EDRF 


528 


100 


62 


D15057 


Homo sapiens 


DAD-1 


567 


100 


63


AF260665 


Homo sapiens 


histone acetyltransferase 


1510 


100 


64 


AF260665 


Homo sapiens 


histone acetyltransferase 


1429 


96 


65 


AJ277145 


Homo sapiens 


ras-related small GTPase RAB18 


1073 


100 


66 


Y94950 


Homo sapiens 


Human secreted protein clone 
dh1073_12 protein sequence SEQ ID
NO:106T 


348 


100 


67 


Y82744 


Homo sapiens 


DNA replication and repair 
associated protein (DRASP). 


1028 


100 


68 


Y44486 


Homo sapiens 


Human GPRW receptor polypeptide. 


1721 


100 


69 


AL031228 


Homo sapiens 


dJ1033B10.2 (WD40 protein BING4
(similar to S. cerevisiae YER082C,
M. sexta MNG10 and C. elegans
F28D1.1) 


3196 


100 
70
71


AJ276'? 1
Y18314


Homo sapiens
Homo sapiens


zinc finger protein 304


1751


52


72 
74 

75 


AF157028

Y7 10519 

AF225420 


Homo sapiens 
Homo sapiens 

Homo sapiens 


paraplegin-like protein 

protein phosphatase methylesterase-1 

Human B-aggressive lymphoma

(BAL) protein. 

AD025 


4146 
2017 
1765 


99 
100 
99 

1 nn 


76 
77 


X95235 
AF108420 


Homo sapiens 

Takifugu
rubripes


transcription factor AP2 

1-aminocyclopropane-carboxylate
synthase 


217 
733 


100 
56 


78 
79 


G01349 
AL117635 


Homo sapiens 
Homo sapiens 


Human secreted protein, SEQ ID 
NO: 5430. 
hypothetical protein 


650 


99 


81 


Z85986


Homo sapiens 


dJ108K11.3 (similar to yeast
suppressor protein SRP40) 


922 
865 


99 
77 


82


AF183414 


Homo sapiens 


hemin-sensitive initiation factor 2a
kinase 


3231 


99 


83 

84 
85 
87 
88 
89 


G01143 

U03985 
Y17791

AF263538 
Y19757 

AF161493 


Homo sapiens 

Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


Human secreted protein, SEQ ID 
NO: 5224. 

N-ethylmaleimide-sensitive factor
VAX2 protein

growth differentiation factor 3 
SEQ ID NO 475 from WO9922243.
HSPC144 


3744 
1496 
1944 
1361 


yo 

99 
100 
99 
100 


90 
91 

94


B25780 

AT 1 /Zoj4 

AL390114 


Homo sapiens 
787 

Mus musculus 
Homo sapiens 
Leishmania 
major 


HSPC144

Human secreted protein SEQ ID 
Meis3 

cardiotrophin-like cytokine CLC 
extremely cysteine/valine rich 
protein 


1185 
856 
647 
1007 
1197 
223 


100 

100 

41 

89 

98 

29 


95 


AB016886


Arabidopsis
thaliana


contains similarity to adenylate 
kinase~gene_id:MCA23.1


287 


38 


70 


A pnnc^oc — 


Homo sapiens


F22162_1


1855 


96 


97

98 


ozuyy / 
AJ006692 


Homo sapiens
Homo sapiens 


Human nucleic acid-binding protein, 

NuABP-1.
ultra high sulfur keratin


3836 


99 




At 1 /zzo4 


Homo sapiens 


Traf2 and NCK interacting kinase,
splice variant 1


507 
6942 


70 
99 


100 


L11239


Homo sapiens 


homeobox protein 


717 


100 


101


AUUU4oy0 


Homo sapiens 


similar to zinc finger proteins; 
similar to AAC01956
(PID:g2843171) 


2154 


98 


102 


AC003682 


Homo sapiens 


R28830_2 


1287 


48 


103 


AF201839 


Rattus 
norvegicus 


dynamin IIIbb isoform


4270 


95 


104 


Y79510 


Homo sapiens 


Human carbohydrate-associated 
protein CRBAP-6.


1394 


100 


105 

106 
108 


Y79510 

AL096748 


Homo sapiens 

Homo sapiens 
Homo sapiens 


Human carbohydrate-associated 
protein CRBAP-6.
hypothetical protein
Metallothionein 2 


1209 
1216 


90 
100 


109 
110 


AL034422 
AF191338 


Homo sapiens 
Homo sapiens 


dJ1141E15.2 (novel protein) 
anaphase-promoting complex subunit
4


381 

433 
683 


100 
100 
100 


111 


AL021712 

1 


Arabidopsis 
thaliana 


putative protein 


1 Q< 


26 


112 

113 
114 


AF250138

AL109976
Y36151 


Homo sapiens 


Homo sapiens
787


small stress protein-like protein 
HSP22 

U794I6. 1 . 1 (novel protein)
Human secreted protein


1063 

4176 
668 


100 

99 
100 





115 


AF110399


Homo sapiens 


elongation factor Ts 


1666 


100 


116 


AF210317 


Homo sapiens 


facilitative glucose transporter family 
member GLUT9 


2052 


99 


117 


Y73328 


Homo sapiens 


HTRM clone 082843 protein 
sequence. 


931 


100 


118 


X04085 


Homo sapiens 


catalase 


2846 


100 


119 


AF147717


Homo sapiens 


ubiquitin C-terminal hydrolase 
UCH37 


1695 


100 


120 


X73882 


Homo sapiens 


microtubule associated protein 


3801 


99 


121 


AC004882 


Homo sapiens 


similar to CAA16821 
(PID:g3255952) 


3223 


100 


122 


M93311 


Homo sapiens 


metallothionein-III 


421 


100 


123 


G03827 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 7908. 


557 


94 


124 


G03827 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 7908. 


222 


53 


125 


AF232009 


Homo sapiens 


peroxisomal trans 2-enoyl CoA 
reductase 


1565 


99 


126 


AB004906 


Ipomoea 
purpurea 


transposase 


146 


20 


127 


M60165 


Homo sapiens 


guanine nucleotide-binding 
regulatory protein 2 


1832


99 


128 


Y10319


Homo sapiens 


carnitine carrier 


1592 


100 


129 


U75467 


Drosophila 
melanogaster 


Atu 


937 


36


130 


Z21507 


Homo sapiens 


human elongation factor-1-delta


494 


87


131 


Z21507 


Homo sapiens 


human elongation factor-1-delta


938 


100


132


Y58633 


Homo sapiens 


Protein regulating gene expression 
PRGE-26. 


6745 


100 


133 


Y58633 


Homo sapiens 


Protein regulating gene expression 
PRGE-26.


4818 


95


134


M13692


Homo sapiens 


alpha-1 acid glycoprotein precursor


1064 


99


135 


U72970 


Sus scrofa


calcium/calmodulin-dependent 
protein kinase II isoform gamma-B


2723 


99 


136 


G03213


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 7294. 


450 


100 


137 


AC005102 


Homo sapiens 


small inducible cytokine subfamily A 
member 24 


627 


99 


138 


AF155648 


Homo sapiens 


putative zinc finger protein 


5855 


92 


139 


AF144638


Homo sapiens 


sphingosine-1-phosphate lyase


2977 


100 


140 


AF152318 


Homo sapiens 


protocadherin gamma A1


4778 


100 


141 


B08517 


Homo sapiens 


Amino acid sequence of a beta- 
tubulin antigen. 


5841 


100 


142 


X56667 


Homo sapiens 


calretinin


1410 


99 


143 


X92763 


Homo sapiens 


tafazzins 


1605 


100 


144 


Y95293 


Homo sapiens 


Human GEF containing NEK-like
kinase substrate sGNK. 


4092 


99 


145 


AF226046 


Homo sapiens 


GK003 


1198 


100 


146 


M22877 


Homo sapiens 


cytochrome c 


554


98 


147 


AJ272212 


Homo sapiens 


protein serine kinase 


2196 


100 


148 


AB026491 


Homo sapiens


PICK1


2114 


98 


149 


AB018580 


Homo sapiens


hluPGFS 


1699 


100 


150 


X91868 


Homo sapiens 


six1


1509 


100 


let 

151 


AF266505 


Mus musculus 


pseudouridine synthase 3 


2135 


84 


152 


U29170 


Drosophila 
melanogaster 


ANON-23D 


883 


43 


153 


G04075 


Homo sapiens 


Human secreted protein, SEQ ID
NO: 8156. 


567 


99 


154 


AY009128 


Homo sapiens 


ISCU2 


138 


100 






wo 01/57190 
155


Homo sapiens

alpha-1,4-N-
acetylglucosaminyltransferase



1842 



100 



Homo sapiens 



157 
158 



AF159297 



AL133325 



Zea mays 



candidate tumor suppressor p33 
ING1 homolog



Homo sapiens 



extensin-like protein 



159 
160 



AF073298 
AC004858 



Homo sapiens 



dJ984P4.3 (Homeobox protein 
NKX2B) 



small EDRK-rich factor 2 



1294 



238 



1437 



294 



99 

"2^ 



100 



100 



Homo sapiens 



U1 small ribonucleoprotein 1SNRP
homolog; match to PID:g4050087 



4032 



100 



161 
162 



AB012109
AL162751 



Homo sapiens 
Arabidopsis 



APC10



990 



100 



163 



AJ005698 



164 



AF117646



165 



AC004002 



166 



M10942



167 



AF126484 
AF161518 



thaliana 



putative protein 



Homo sapiens 



Homo sapiens 



poly(A)-specific ribonuclease 



Homo sapiens 



longCBL-3 protein 



Homo sapiens



similar to ciliary dynein beta heavy 
chain; 78% Similarity to P23098 
(PID:g118965)



Homo sapiens 



human metallothionein-Ie 



CARD4 



194 



3351 



2547 



5065 



381 



4961 



32 



100 



99 



100 



100 



100 



169 



170 



171 



172 



173 



174 



175 



M64983 
M64983 



Homo sapiens 



Homo sapiens 



M58514 



Homo sapiens 



AF078845 



AC004774 



Gallus gallus 



Homo sapiens 



Z98974 



X56203 



177 
178 



W74726 



AJ222967 
AC024796 



Homo sapiens 



Schizosacchar 
omyces pombe 



Plasmodium 
falciparum 



Homo sapiens 
Homo sapiens 



Caenorhabditis 
elegans 



HSPC169 



fibrinogen beta chain 



fibrinogen beta chain 
fibrinogen beta chain 
16.7Kd protein 



Dlx-6 



putative vacuolar protein sorting- 
associated protein 



liver stage antigen 



Human secreted protein 1^49_3'.



cystinosin



contains similarity to TR:O76167



1604 



2482 



2679 



1059 



786 



923 



185 



283 



IS79 
1920 



221 



100 



100 



100
78 



100 



100 



31 



23 



100 



100 



27 



179 
180 



AF151803 
G02694 



Homo sapiens 



Homo sapiens 
Homo sapiens 



Membrane-bound protein PRO276.



CGI-45 protein 



1370 



215 



100 



28 



181 



182 



Y17292 



Homo sapiens 



Human secreted protein, SEQ ID 
NO: 6775. 



183 



AF234765 



184 



AF151855 



185 
186 



AF289664 



AL022238 



187 



AL022238 



188 



189 



190 



X83543 



AF059569 



191 



192 

i9r 



M18135 



AF242194 



D30689 
Y44984 



Rattus 
norvegicus 



Homo sapiens 
Mus musculus 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Rattus 
norvegicus 



Drosophila 
melanogaster 



Bacillus 

subtilis 

Homo sapiens 



Human cell death preventing kinase 
(DPK-1) protein sequence.



serine-arginine-rich splicing 
regulatory protein SRRP86 



CGI-97 protein
CYLN2 



dJ1042K10.2 (supported by 
GENSCAN, FGENES and 
GENEWISE) 



dJ1042K10.2 (supported by 
GENSCAN, FGENES and 
GENEWISE) 



APXL 

actin binding protein MAYVEN 



smooth-muscle alpha tropomyosin 



brakeless-B 



subunit of nitrite reductase 



283 



2676 
1214 



4673 



4059 



2332 



8513 



3106 



1306 



147 



100 



100 



27 



96 



90 



100 



100 



99 



99 



95 



52 



29 



Human epidermal protein-1.



538 



97 





194 


B25679 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 15 SEQ ID NO:68. 


760 


100 


195 


AB020315 


787 


homologue of mouse dkk-1 gene:Acc


1466 


100 


196 


U35730 


Mus musculus 


jerky


2021 


75 


197 


AL136450 


Homo sapiens 


dJ510O21.1 (novel protein)


632 


100


198 


X56203 


Plasmodium 
falciparum


liver stage antigen 


512 


24 


199 


Y70775 


Homo sapiens 


Follistatin-related protein zfsta.


2027 


63 


200 


X87237 


Homo sapiens 


a-glucosidase I 


4447 


99 


201 


AF101078 


Caenorhabditis 
elegans 


CLU-1 


1393 


46 


202 


X04571 


Homo sapiens 


precursor polypeptide (AA -22 to 
1185) 


6611 


100 


203 


X00474 


Homo sapiens 


pS2 precursor 


466 


100 


204 


AB029333 


Halocynthia 
roretzi


HrPET-1 


974 


54 


205 


AF146019 


Homo sapiens 


hepatocellular carcinoma antigen 
gene 520 


998 


100 


206 


AF071002 


Homo sapiens 


minK-related peptide 1; MiRPl 


632 


100 


207 


AB038162 


Homo sapiens 


trefoil factor 2 


744 


100 


208 


U30521 


Homo sapiens 


P311HUM 


363 


100 


209 


AB000911 


Sus scrofa 


ribosomal protein 


782 


100 


210 


AB021227 


Homo sapiens 


membrane-type-5 matrix 
metalloproteinase 


3545 


100 


211 


AF180920


Homo sapiens 


cyclin L ania-6a


2722 


100 


212 


AF105365 


Homo sapiens


K-Cl cotransporter KCC4 


5624 


100 


213 


U29244 


Caenorhabditis
elegans 


similar to human (TRE) transforming 
protein (PIR:S22157)


602 


32 


214 


AL033538 


Homo sapiens 


dJ477H23.1 (novel protein)


3195 


100 


215 


X52011 


Homo sapiens 


muscle determination factor 


1262 


100 


216 


AF083248


Homo sapiens 


ribosomal protein L26 homolog


739


100 


217 


AF006751 


Homo sapiens 


ES/130


4793 


99 


218 


AB007859 


Homo sapiens 


KIAA0399 protein 


3559 


99 


219 


AK026291 


Homo sapiens 


unnamed protein product 


826 


100 


221 


Y84045 


Homo sapiens 


Splice variant of cancer associated 
polypeptide CH1-9a11-2.


5851 


97 


222 


Z67996 


Homo sapiens 


tenascin-R (restrictin) 


7186 


100


223 


AF134802 


Homo sapiens 


cofilin isoform 1 


846 


100 


224 


Y17711 


Homo sapiens 


atopy related autoantigen CALC 


1611 


99 


225 


AF190051 


Gallus gallus 


hepatocyte nuclear factor 1a
dimerization cofactor isoform 


443 


81 


226 


AK026256 


Homo sapiens 


unnamed protein product 


866 


98 


227 


Z69368 


Schizosacchar 
omyces pombe


nuf2-like coiled-coil protein


230 


25 


228 


AF275948 


Homo sapiens 


ABCA1


11763 


99 


229 


AF161384 


Homo sapiens 


HSPC266


2006 


98 


230 


Y16270 


Homo sapiens 


paralemin 


1951 


100 


231 


AJ245599 


Homo sapiens 


putative secreted ligand 


2379 


99 


232 


W88499 


Homo sapiens 


Human stomach carcinoma clone 
HP10412-encoded protein. 


1545 


99 


233 


AF096286 


Mus musculus 


pecanex 1 


3623 


93 


234 


V64619_cd 
1 


Homo sapiens 


30-NOV-1990 Human HE1 cDNA.


796 


100 


235 


V64619_cd
1 


Homo sapiens 


30-NOV-1990 Human HE1 cDNA.


470 


98 


236 


AF227258 


Bos taurus 


RPGR-interacting protein-1


1262 


38 


237 


AJ132445 


Homo sapiens 


claudin-14 


1181 


100 


238 


AL034562 


Homo sapiens 


dJ684024.2 (prodynorphin (Beta- 


1330 


100 





239 


AF262027 


Homo sapiens 


Neoendorphin-Dynorphin precursor,
Proenkephalin B precursor)) 
eIF-5A2 










Arabidopsis 
thaliana 


putative protein


808 
194 


100 
33 


241 


AC002394 


Homo sapiens


Gene product with similarity to 
dynein beta subunit 


1542 


51 


242 


AJ271361 


Takifugu 
rubripes 


FRANK2 protein 


303 


30 


243 
244 


AL021918 
AF190167 


Homo sapiens 
Homo sapiens 


b34I8.1 (Kruppel related Zinc Finger 
protein 184) 

membrane associated protein SLP-2 


1476 
1736 


48 
99 


245

246 


YlOoOl 
AL121771 


Homo sapiens 
Homo sapiens 


ankyrm-like protein 
dJ548G19.1 (novel protein
(ortholog of mouse zinc finger
protein ZFP64) (translation of cDNA
NT2RP3001398 (Em:AK001596))
(isoform 1))


5877 
3628 


100 
100 


247 
248 


L25314 
X63745 


Drosophila 
melanogaster 
Homo sapiens 


actin-related protein
KDEL receptor 


984 


47 


249 


AF112208


Homo sapiens


13kDa differentiation-associated 
protein 


\.\}yD 

816 


100 
100 


250 


AP001707 


Homo sapiens 


human gene for claudin-8, Accession
No. AJ250711 


1179


1 AA 


i 


ATI "^Al 1^ 

A T (VX 1 1 C< 
/VL^UJi loo 


Homo sapiens 
Homo sapiens 


dJ304B14.1 (novel protein) 


778 


100 


251
254 


AL049843 


Homo sapiens
Homo sapiens 


bK984Gl.l (supported by FGENES) 
Human secreted protein clone BL205 
14 protein. 

dJ392M17.3 (KIAA0349 protein) 


532 
639 


100 
100 


255




Homo sapiens


TOLLIP protein


6741 

1424 


99 

99


256 


Y9487> 


Homo sapiens 


Human protein clone HP02632.


1 o /o 


1 AA 


257 


AF279865 


Homo sapiens 


kinesin-like protein GAKIN 


2903 


100 


258


AT n94/10C 


Homo sapiens 


dJ417M14.1 (novel protein) 


589 


100 


259 


R66278 


Homo sapiens 


Therapeutic polypeptide from 
glioblastoma cell line. 


830 


100 


260 
261 


AF101784 
AF101784 


Homo sapiens 
Homo sapiens 


b-TRCP variant E3RS-IkappaB 
b-TRCP variant E3RS-IkappaB


3226 


99 


262 


AF101784 


Homo sapiens 


b-TRCP variant E3RS-IkappaB 


3149 


1 AA 

100 
99 




API 07 A 


Homo sapiens 


src homology 3 domain-containing 
protein HIP-55 


2257 


100 


264
265 


X oOZOZ 

Y56966


Homo sapiens 
Homo sapiens 


Human secreted protein HAQAR23, 
SEQ ID NO: 177. 


766 


100 


266 


Y56966 


Homo sapiens 


Human SBPSAPL polypeptide. 
Human SBPSAPL polypeptide. 


2779 
1018 


100 
99 


267 


AJ300465 


Homo sapiens 


putative white family ATP-binding
cassette transporter 


1557 


0^ 


268 
269 


AC004030 
X55954 


Homo sapiens 
Homo sapiens 


1^'21856 2 

HL23 ribosomal protein 


3579 


99 


270 
271 
272 


AB033921 
AF081886 
AF166492 


Mus musculus 
Homo sapiens 
Homo sapiens 


Ndr1 related protein Ndr2
ERO1-like protein
small GTPase RAB6B 


714 

1S5S 

1905 


100 

y4 
99 


273 
274 

275 


AL022238 
W88667 

X00129


Homo sapiens 
Homo sapiens 

Homo sapiens 


dJ1042K10.4 (novel protein) 
k_7w>M w«.wu j^iWLCiii cilcoQcQ Dy sene 
134 clone HAIBP89. 


1060 
2201 
1530 


100 
100 
99 


276
277 


Z47500_cd1
AB049188


Homo sapiens 
Equus caballus


precursor RBP 

11-MAY-1998 Human RHOH gene
sequence. 

ubiquitin C-terminal hydrolase


1044 
1161 

1118 


97 
100 

96 





278 


AF270647 


Homo sapiens 


GTT1


1564 


100 


279 


AF143956 


Mus musculus 


coronin-2 


2414 


94 


280 


R85151 


Homo sapiens 


Endothelial cell polypeptide. 


911 


92 


281 


R85151 


Homo sapiens 


Endothelial cell polypeptide. 


1031 


100 


282 


D83948 


Rattus 
norvegicus 


S1-1 protein


3975 


90 


283 


Y14768 


Homo sapiens 


I Kappa B-like protein 


2037 


100 


286 


AL031316 


Homo sapiens 


dJ28O10.3 (HSD11B1
(hydroxysteroid (11-beta)
dehydrogenase 1))


294 


100 


287


D64109 


Homo sapiens 


tob family 


1773 


99 


288 


AB026043 


Homo sapiens 


MS4A7 


1230 


100 


289 


M61866 


Homo sapiens 


Krueppel-related DNA-binding 
protein 


209 


90 


290 


AJ001810 


Homo sapiens 


mRNA cleavage factor 1 25 kDa 
subunit 


1217 


100 


291 


Y99454 


Homo sapiens 


Human PRO1605 (UNQ786) amino
acid sequence SEQ ID NO:395. 


694 


100 


292 


Y44824 


Homo sapiens 


Human molecule associated with cell
proliferation, MACP-4.


2370 


100 


293 


AJ276101 


Homo sapiens 


GPRC5B protein 


2099 


100 


294 


AF161406 


Homo sapiens 


HSPC288 


719 


100 


295 


Y58628 


Homo sapiens 


Protein regulating gene expression 
PRGE-2L 


1276 


100 


296 


U91561 


Rattus 
norvegicus 


pyridoxine 5'-phosphate oxidase 


1239 


87 


297 


L02956 


Xenopus 
laevis 


ribonucleoprotein 


1624 


83 


298 


AF226730 


Homo sapiens 


Cyt19


1729 


99 


299 


AF226730


Homo sapiens 


Cyt19


906 


98 




-n 7Y54324 * ' 


Homo sapiens


Amino acid sequence of a human
gastric cancer antigen protein. 


7.18 




301 


AF125533 


Homo sapiens 


NADH-cytochrome b5 reductase 
isoform 


1606 


100 


302 


Y32206 


Homo sapiens 


Human receptor molecule (REC) 
encoded by Incyte clone 2825826, 


1676 


98 


303 


AF247565 


Homo sapiens 


hepatocellular carcinoma associated 
ring finger protein 


525 


100 


304 


AF208844 


Homo sapiens 


BM-002 


428 


100 


305 


AC004983 


Homo sapiens 


similar to PID:g3877944 


1988 


100 


306 


AL132978 


Arabidopsis 
thaliana 


putative protein


210 


25 


307 


Y10530 


Homo sapiens 


olfactory receptor 


1645 


100 


308 


AF180681 


Homo sapiens 


guanine nucleotide exchange factor 


3597 


100 


309 


AF111856 


Homo sapiens 


sodium dependent phosphate
transporter isoform NaPi-3b 


3591 


99 


310 


Y13583 


Homo sapiens


G-protein coupled receptor 


2171 


100 


311 


Z73420 


Homo sapiens 


CB146D10.2 (mercaptopyruvate
sulfurtransferase (EC 2.8.1.2))


1598 


100 


312 


X79535 


Homo sapiens 


beta tubulin 


2348 


100 


313 


AF070658 


Homo sapiens 


HSPC002 


861 


100 


314 


AF078866 


Homo sapiens 


SURF-4 


1395 


100 


317 


Z37986 


Homo sapiens 


phenylalkylamine binding protein


1258 


100 


320 


AB047892 


Macaca 
fascicularis 


hypothetical protein 


258 


82 


321 


Y25755 


Homo sapiens 


Human secreted protein encoded
from gene 45. 


1440 


100 


322 


AB016531 


Homo sapiens 


PEX16 


1741 


100 


323 


AL391141 


Arabidopsis 


putative protein 


274 


49
SEQ
ID 
NO: 

325 

326 
327 


ACCESSION 
NUMBER 

AF140501 
X96698 
AF152325 


SPECIES 

thaliana
Homo sapiens
Homo sapiens 


DESCRIPTION

DNA polymerase iota 
D1075-like 

protocadherin gamma A5


SMITH- 
WATERMAN 
SCORE 

3691 
1450 


% 

IDENTITY 

99 
96 


328 
329 
330 
331 


AF151803 
X74070 

AF171102 
W54040 


Homo sapiens 

Homo sapiens 
Homo sapiens


CGI-45 protein 
transcription factor BTF3
retinal degeneration B beta 
Human interferon-inducible protein,
HIFI. 


4769 
1970 
639 
1302 
484 


100 

100 

81 

95 

98 


332 


AF024617 




transcription-associated zinc ribbon 
protein 


691 


100 


333 


U19181 


Rattus
norvegicus 


Rabin3 


2129 


90 


334 


G03877 


Homo sapiens 


Human secreted protein, SEQ ID
NO: 7958. 


621 


100 


335 

336 
337 


AL008582 

AF110774
AB011414 


Homo sapiens 

Homo sapiens 
Homo sapiens 


bK223H9.2 (ortholog of A. thaliana
F23F1.8)

adrenal gland protein AD-001 
Kruppel-type zinc finger protein 


626 

647 


100 
100


338 
340 

341 


AF207600 
AC02057Q 

Y28576 


Homo sapiens 

Arabidopsis
thaliana

Homo sapiens 


ethanolamine kinase 
putative 

phosphoribosylformylglycinamidine 
synthase; 25509-29950 


1674 
129 
3283 


58 
100 
50 


342 


U32274 


Saccharomyce 
s cerevisiae 


Secreted peptide clone pe503 1.
Ydr386wp; CAI: 0.12


944 
191 


100 
37 


343 


A01771 


synthetic 
construct 


vascular anticoagulating protein 


1661 


99 


344 


345 


AF220052 
Y70400


Homo sapiens 
Homo sapiens 


uncharacterized hematopoietic
stem/progenitor cells protein 
MDS032 


1285 


100


346 
347 


Y50926 
AF183428


Homo sapiens 
Homo sapiens 


Human cell-signalling protein-2r
Human fetal brain cDNA clone 
vcl6 1 derived protein. 
28.4 kDa protein 


754 
962 


100 
100 


348 


AC006069 


Arabidopsis 

thaliana


putative cleavage and 
polyadenylation specificity factor


1329 
1383 


100 

55 


349 
350 


AL032631 
U70669 


Caenorhabditis 
elegans 


Y106G6H.8 

Fas-ligand associated factor 3


194 


39 


351 


Y93468 


Homo sapiens
Homo sapiens


Amino acid sequence of a potassium 
channel interactor protein. 


167 
1182 


23 
92 


352 
353 


AF005856 
AJ271684 


yakuba 

l-Tnirm csiniAnc 


anonzAj 

myeloid DAP12-associating lectin 


111 
1013 


45 
100 


354 
355 


AF099100 
U51730 


Homo sapiens 
Murine 
leukemia virus 


WD-repeat protein 6 
reverse transcriptase 


2882 
316 


99 

49 


356 


D50617 


Saccharomyce 
s cerevisiae 


YFL042C 


279 


27 


357 
358 


D50617
AF161432 


Saccharomyce 
Homo sapiens 


YFL042C
HSPC314 


279 


27 


359 
360 


AB029488 
AJ251024 


Homo sapiens 
Homo sapiens 


CllorCl 


1059 
758 


93 

00 


361 


U43281 


Saccharomyces cerevisiae


putative odorant binding protein ag
Lpg22p 


1239 
2074 


100 

74 


362 


U43281


Saccharomyces cerevisiae


Lpg22p 


2153 


74




363 


AC007153 


Arabidopsis 


100632 


156 


24 






thaliana








364 


AF197927


Homo sapiens 


AF5q31 protein 


3992 


99 


365 


D28500 


Homo sapiens 


mitochondrial isoleucine tRNA 


4286 


98 








synthetase 






366 


X97868 


Homo sapiens 


arylsulphatase 


3141 


98 


367 


AL162048


Homo sapiens 


hypothetical protein 


1532 


100 


368 


L36062 


Mus musculus


steroidogenic acute regulatory 


189 


25 








protein 






369 


AF113249


Homo sapiens 


multiple domain putative nuclear 


1022 


59 








protein 






370 


M15888 


Bos taurus 


endozepine-related protein precursor


2425 


84 


371 


X66363 


Homo sapiens 


serine/threonine protein kinase


2562 


100 


372 


W74802 


Homo sapiens 


Human secreted protein encoded by 


1532 


89 








gene 73 clone HSQEL25. 






373 


AF100772


Homo sapiens 


tenascin-Ml 


11535 


99 


374 


AF090934


Homo sapiens 


PRO0518 


382 


100 


375 


AB021643 


Homo sapiens 


gonadotropin inducible transcription


2761 


99 








repressor-3 






376 


AB049758 


Homo sapiens 


MAWD binding protein


1331 


100


377 


AF070666 


Homo sapiens 


Kruppel-associated box protein 


466 


97 


378 


S59342 


Mus sp. 


nuclear pore complex glycoprotein 


464 


60 








p62 






379 


AF149205 


Mus musculus 


Su(var)3-9 homolog Suv39h2 


1690 


88 


380 


AF227906 


Homo sapiens 


UDP-glucose:glycoprotein 


7851 


99 








glucosyltransferase 2 precursor 






381 


AF118566


Mus musculus 


hematopoietic zinc finger protein


1769 


92 


382 


AK000619 


Homo sapiens 


unnamed protein product


810 


100 


383 


AF227906 


Homo sapiens 


UDP-glucose:glycoprotein


7851 


99 








glucosyltransferase 2 precursor






384 


AE117946


Homo sapiens


Link guanine nucleotide exchange


2363


100








factor II 






385 


AF125390 


Drosophila 


L82G 


139 


41 






melanogaster 








386 


Y94907 


Homo sapiens 


Human secreted protein clone 


1092 


50 








cal06 19x protein sequence SEQ ID 












NO:20. 






387 


U18795 


Saccharomyce 


Yel064cp 


206 


28 






s cerevisiae 








388 


AF177388 


Homo sapiens 


cancer-amplified transcriptional 


10748 


99 








coactivator ASC-2 






389 


AJ002744 


Homo sapiens 


UDP-GalNAc:polypeptide N- 


3469 


96 








acetylgalactosaminyltransferase 7 






390 


AF097366 


Homo sapiens 


cone sodium-calcium potassium 


3166 


100 








exchanger 






391 


AF217525 


Homo sapiens 


Down syndrome cell adhesion 


5337 


60 








molecule 






392 


U81035 


Rattus 


ankyrin binding cell adhesion 


3967 


91 






norvegicus 


molecule neurofascin 






393 


X65224 


Gallus gallus 


neurofascin 


4097 


78 


394 


X13916 


Homo sapiens 


LDL-receptor related precursor (AA 


4292 


99 








-19 to 4525) 






395 


AF151083 


Homo sapiens 


HSPC249 


444 


98 


396 


AB017026


Mus musculus 


oxysterol-binding protein 


2173 


98 


397 


AL035587 


Homo sapiens 


dJ475N16.4 (KIAA0240) 


2393 


100 


398 


W74813 


Homo sapiens 


Human secreted protein encoded by 


722 


92








gene 85 clone HSDFV29. 






399 


Y71110 


Homo sapiens 


Human Hydrolase protein-8 


1637 










(HYDRL-8).


400 




Caenorhabditis
elegans 


contains similarity to lupus LA 
protein homologs 


325 


43 


401 


AE000877 


Methanothermobacter
thermoautotro


conserved protein 




231 




402 
403 


Y27795 
Z50853 




Human secreted protein encoded by 

nana XTr» 

gene ino. /y. 

PT PP 


1539 


99 


405 


X03475 


Rattus 
norvegicus


ribosomal protein L35a (aa 1-110)


615 
576 


100 
99 


406 


AF144237 


Homo sapiens 




252 


44 


407 


U20239 


Mus musculus 


fibrosin 


288 


76 


409 


AL033378 




cu JM4. 1 (KIAA0790 protein)


6026 


99 


410 


X54326 


Homo sapiens 


glutaminyl-tRNA synthetase 


7577 


99 


411 


X61585 


Bos taurus


polynucleotide adenylyltransferase 


3715 


97 


412 


AF217190 


Homo sapiens 


MLEL1 protein


5271 


99 


414 


G02815 


Homo sapiens


Human secreted protein, SEQ ID 

Pi\J. Ooyu. 


314 


95 


415 
416 


AJ245922 
AF203032 




alpha-tubulin 8 


2370 


100 


417 


Z97653 


Homo sapiens 


neurofilament protein 

c3 80 A 1.2.1 (novel protein (isoform 


220 
1567 


21 
100 


418 


AJ404326 






1871 


99 


419 


AJ404326 


Homo sapiens




902 


64 


420 


AF134726 






5334 


99 


421 


L28125 


Podospora 

anserina


beta transducin-like protein 


288 


39 


422 


W21733 


Homo sapiens 


NIP-1 encoded by clone 59. 


110 


72 


423 


S67970 


Homo sapiens


ZNF75=KRAB zinc finger


951 


76 


424


L28035 


Mus musculus


protein kinase C gamma 


3768 




426 


Y73373


Homo sapiens


ti i KM clone 92 1 803 protein 
sequence. 


555 


56 


427 


Y73373 


Homo sapiens


rHKM clone 921803 protein
sequence. 


266 


49 


428 


X61118 




1 1 ij-2a/KJ3 i N-2a 


876 


100 


429 


Z96932 


Homo sapiens 


nuclear autoantigen of 14 kDa


496 


83 


430 


AJ277291 


Homo sapiens


xibLfU protem 


678 


72 


431 


X82157 


Homo sapiens 


hevin 


3525 


99 


432 


AC007199 


Homo sapiens


r85B_HUMAN; FrDlNS-3- 


3825 


99 


433 


AL021918 


Homo sapiens 


b34I8.1 (Kruppel related Zinc Finger 
protein 184)


1713 


50 


434 


AF084464 


Rattus 

norvegicus


GTP-binding protein REM2 


141 


29 


435 


AL049795 


Homo sapiens 


dJ622L5.2 (novel protein) 


1756 


98 


436 


M14513 


Rattus

norvegicus 


(Na+ and K+) ATPase, alpha(III)
catalytic subunit 


4269 


99 


437 


U33460 


Homo sapiens


DNA-directed RNA polymerase I,
largest subunit 


8777 


98 


438 


D87076 


Homo sapiens


similar to human bromodomain 
protein BR140(JC2069) 


3067 


100 


439 


1A3912 


Macaca 
mulatta 


mannose-binding protein A


589 


93 


440 


D31763 


Homo sapiens 


tia0946 protein is Kruppel-related. 


927 


49 


441 


U70976 


Homo sapiens


mestin 


2068 


99 


442 
443 


B08069
AF100662


Homo sapiens

Caenorhabditis


human beta-alanine-pyruvate
aminotransferase (HAPA).
contains similarity to ubiquitin


2343 
166 


99 
24






elegans 


carboxyl-terminal hydrolase (Pfam: 
UCH-l.hmm, score: 28.46) (Pfam: 
UCH-2.hmm, score: 47.53)






444 


D78017 


Rattus 
norvegicus 


NH-Al 


2667


98 


445 


AL049569 


Homo sapiens 


dJ37C10.3 (novel ATPase) 


2418 


100 


448 


AJ242540 


Volvox carteri 
f. nagariensis 


hydroxyproline-rich glycoprotein 
DZ-HRGP 


165 


34 


449 


AJ133352 


Homo sapiens 


ZNF237 protein 


2006 


100 


450 


AJ133352 


Homo sapiens 


ZNF237 protein 


1025 


96 


451 


AF170708


Homo sapiens 


T-box protein TBX3 


3700 


99 


452 


AK002080 


Homo sapiens 


unnamed protein product


1546 


99 


453 


L32977 


Homo sapiens 


Rieske Fe-S protein 


1239 


93 


454 


X51760 


Homo sapiens 


zinc finger protein (583 AA) 


1533 


57 


455 


Y01141 


Homo sapiens 


Secreted protein encoded by gene 7 
clone HTLFA90. 


1453 


99 


456 


AB006631 


Homo sapiens 


The human homolog of mouse Cux-2 


6559 


100 


457 


AF067165 


Homo sapiens 


zinc finger protein 3 


977 


64 


458 


AF038169 


Homo sapiens 


unknown 


154 


38 


459 


W75214 


Homo sapiens 


Human secreted protein encoded by 
gene 19 clone HRSMC69. 


1180


95 


460 


U97002 


Caenorhabditis 
elegans


similar to acyl-CoA dehydrogenases 
and epoxide hydrolases; Pfam 
domain PF00441 (Acyl-CoA dh), 
Score=57.4, E-value=1.7e-16,N=2; 
contains similarity to Pfam domain 
PF00702 (Hydrolase), Score=57.4, 
E-value=1e-13, N=1


583 


37 


461 


AK023114 


Homo sapiens 


unnamed protein product 


1041 


99 


462 


M93134


Friend murine 
leukemia virus


pol protein


289 


44 


463


AF055473


Homo sapiens 


GAGE-8


232 


47


466 


Y51415 


Homo sapiens 


Human wild type pKe83 protein. 


2625 


100 


467 


Y51417 


787 


Human pKe83 splice variant protein 


2433 


100 


468 


Y57936 


Homo sapiens 


Human transmembrane protein 
HTMPN-60. 


1629 


96 


469 


D38552 


Homo sapiens 


The hal539 protein is related to 
cyclophilin.


2995 


100 


470 


Y70013 


Homo sapiens 


Human Protease and associated 
protein-7 (PPRG-7).


3530 


100 


471 


AJ224747 


Homo sapiens 


C-terminal variant of hINADL 
including 2 amino acid exchanges 
and an insertion of 28 amino acids in
frame. 


7969 


100 


472 


W99665 


Homo sapiens 


Human secreted protein clone 
dul57_12 protein. 


1546 


100 


473 


W99665 


Homo sapiens 


Human secreted protein clone 
dul57_12 protein. 


998 


98 


474 


X63526 


Homo sapiens 


homologue to elongation factor 1- 
gamma from A.salina 


2273 


99 


475 


X15940 


Homo sapiens 


ribosomal protein L31 (AA 1-125) 


644 


100


476 


M60832 


Homo sapiens 


alpha-2 type VIII collagen


3581 


99 


477 


AF039697 


Homo sapiens 


antigen NY-CO-31 


1213 


97 


478 


AF156929 


Sus scrofa 


inflammatory response protein 6 


1588 


83 


479 


AF264717 


Homo sapiens 


FYVE domain-containing dual 
specificity protein phosphatase
FYVE-DSP2 


5610 


99 


480 


AF044578 


Homo sapiens 


putative DNA polymerase; POL4P


2478 


94 


481 


X89750 


Homo sapiens 


TGIF protein 


1413 


100


482 


M93107 


Homo sapiens


(R)-3-hydroxybutyrate
dehydrogenase 


1663 


96 


483 


U58334 


Homo sapiens


Bbp/53BP2


1556 


41 


484 


AF151538 




deoxycytidyl transferase; Rev1p


4281 


99 


485 


Z98884 


Homo sapiens 


dJ467Ll.l (KIAA0833) 


699 


73 


486 


AJ243874 


Homo sapiens


oligophrenin-4 


3682 


100 


487 


Z11737 


Homo sapiens 


flavin-containing monooxygenase 4 


2969 


100 


488 


X56123 


Mus musculus 


talin 


4353 


77 


489 


•iVJx. lOl 1^ 


Homo sapiens 


putative cell cycle control protein 


335 


23 


490 


»T /tot' J 


Homo sapiens 


Human secreted protein encoded by 
gene i clone HOVBA03. 


1013 


98 


491 


Y41337 


Homo sapiens 


Human secreted protein encoded by 
gene 30 clone HRDDV47. 


509 


36 


492 


X90530 


Homo sapiens 


ragB 


1926 


99 


493 




Homo sapiens 


raga 


1405 


99 


494 




Homo sapiens 


ragJhJ 


1893 


96 


495 


AL022394 


Homo sapiens 



dJ511B24.3 (KIAA0395 (probable 
homeobox protein)) 


4990 


99 






Homo sapiens 


lanthionine synthetase C-like protein 
1 


2168 


100 


497




Homo sapiens 


Ribosomal protein kinase B (RSK-B) 


4001 


100 


498 


G01563 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 5644. 


330 


100 


499 


X54131 


Homo sapiens 


protein-tyrosine phosphatase 


10465 


99 


500 


G01082 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 5163. 


549 


100 






Homo sapiens 


similar to murine leucine-rich repeat 
protein; possible role in neural
development by protein-protein 
interactions; 93% similarity to 
D49802 (PID:g1369906).


3676 


100 


502 


AL117544


Homo sapiens 


hypothetical protein


1226 


100


503 


AF203032 


Homo sapiens 


neurofilament protein 


5115 


99 


504 


AL034417 


Homo sapiens 


bK215D11.2 (similar to rat gene 33)


2476 


100 


505 


X69090 


Homo sapiens 


190kD protein 


7546 


99 


506 


U58755 


Caenorhabditis 
elegans 



coded for by C. elegans cDNA
yk34b1.5; coded for by C. elegans
cDNA yk13h10.5; coded for by C.
elegans cDNA yk46e8.5; coded for
by C. elegans cDNA yk46d5.5;
coded for by C. elegans cDNA
yk43c2.5; coded for by C. elegans
cDNA yk46e8.3; coded for by C.
elegans cDNA yk43c2.3; coded for
by C. elegans cDNA yk46d5.3;
coded for by C. elegans cDNA
yk13f10.3; coded for by C. elegans
cDNA yk34b1.3


782 


55 


507 




Homo sapiens 


NHP2 protein 


801 


100 


508 


U39045 


Rattus
norvegicus


cytoplasmic dynein intermediate 
chain 2B


3241 


97 


509 


AF063231 


Mus musculus 


cytoplasmic dynein intermediate 
chain 2


3159 


97 


510 


AF202893 


Mus musculus 


Kif21b




yj 


511 


Y13115 


Homo sapiens 


serine/threonine protein kinase 


5071 


99 


512 


AB030207 


Homo sapiens 


G gamma subunit 


364 


100 


513 


AF039571 


Homo sapiens 


peripheral benzodiazepine receptor 
interacting protein; PBR-IP/PRAX1


495 


33 


514 


AB037883 


Homo sapiens


Gb3/CD77 synthase


1916 


99


515 


D90868 


Escherichia 
coli 


similar to 


1489 


100 


516 


X98834 


Homo sapiens 


zinc finger protein HsaJ2 


5290 


100 


517 


AF055668 


Mus musculus 


apoptosis-linked gene 4, deltaC form


2904


78 


518 


AF019926 


Mus musculus 


protein kinase


1694 


90 


519 


M34513 


Homo sapiens 


omega protein 


317 


91 


520 


Y08612 


Homo sapiens 


88kDa nuclear pore complex protein 


2313 


99 


521 


Y08612 


Homo sapiens 


88kDa nuclear pore complex protein


1561 


99 


522 


AL096766 


Homo sapiens 


dA59H18.1 (KIAA0767 protein)


2497 


100 


523 


AF186249


Homo sapiens 


six transmembrane epithelial antigen 
of prostate 


1790 


100 


524 


AB029012 


Homo sapiens 


KIAA1089 protein 


4933 


100 


525 


AB026893 


Homo sapiens 


vascular cadherin-2


5962 


100 


526 


X74331 


Homo sapiens 


DNA primase (p58 subunit)


1720 


100 


528 


AC007228 


Homo sapiens 


R31665 2 


1488 


47 


529 


X14830 


Homo sapiens 


acetylcholine receptor beta-subunit 
preprotein


2639 


100 


530 


U80446 


Caenorhabditis 
elegans 


coded for by C. elegans cDNA
yk172e6.3; coded for by C. elegans
cDNA yk158f7.3; coded for by C.
elegans cDNA yk158f7.5; coded for
by C. elegans cDNA yk172e6.5


420 


39 


531 


S76838 


Mus sp. 


Dbs 


4821 


88 


532 


Z82215 


Homo sapiens 


dJ6802.2 (myosin, heavy 
polypeptide 9, non-muscle) 


9828 


100


533 


AF245505 


Homo sapiens 


adlican 


277 


31 


534 


AF300612 


Homo sapiens 


N-acetylgalactosamine-4-O-sulfotransferase


993 


59 


535 


AL121928 


Homo sapiens 


bA18I14.3 (pleckstrin and Sec7 
domain protein) 


3333 


99 


536 


AJ271G55


Mus musculus


irpjqupisAprn . j: * 


1724 


76 


537 


AF180473


Homo sapiens 


Not2p


2267 


100 


538 


AF071059 


Mus musculus 


zinc finger RNA binding protein 


1089 


51


539 


AF023453 


Homo sapiens 


actin-related protein 3-beta 


2219 


100 


540 


AC003030 


Homo sapiens 


R29828 1 


1401 


70 


541 


AC003030 


Homo sapiens 


R29828 1 


2294 


100 


542 


AL121889 


Homo sapiens 


dJ1076E17.1 (KIAA0823 protein 
(continues in AL023803)) 


2152 


100 


543 


AB006135 


Rattus 
norvegicus 


db83 


1238 


98 


544 


G02650 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 6731. 


644 


97 


545 


Y07595 


Homo sapiens 


transcription factor TFIIH 


2373 


100 


546 


AL133545 


Homo sapiens 


bA386N14.1 (novel protein similar
to a dual specificity phosphatase) 


964 


99 


547 


X83618


Homo sapiens 


hydroxymethylglutaryl-CoA 
synthase 


2647 


100 


548 


AF134726 


Homo sapiens 


NG37 


4359 


99 


549 


AB035356 


Homo sapiens 


neurexin I-alpha protein 


6948 


99 


551 


AB037901 


Homo sapiens 


gene amplified in squamous cell 
carcinoma-1 


5215 


99 


552 


AB043634 


Homo sapiens 


PAR-6A 


885 


100 


553 


AP000693 


Homo sapiens 


partial CDS 


4875


99 


554 


AF002223 


Homo sapiens 


myotubularin related 1 


3490 


100 


555 


AC004893 


Homo sapiens 


similar to NEDD-4 (KIA0093); 
similar to P46934 (PID:g1171682)


1611 


100 


556 


AJ404468 


Homo sapiens 


axonemal dynein heavy chain 


8328 


100 


557 


AJ404468 


Homo sapiens 


axonemal dynein heavy chain 


11137 


100 



WO 01/57190


558 


X65873 


Homo sapiens




4860 


100 


559 
560 


AJ277365 
AF205600 


Homo sapiens 

Homo sapiens


polyglutamine-containing protein


592 


36 


561 


X71125


^tLil \J O u L/X wild 

Homo sapiens 


transposase-like protein
glutaminyl-peptide cyclotransferase 


407 
1914 


27 

100


562 


X71125 




glutaminyl-peptide cyclotransferase 


1456 


97 


563 


X54304 


Homo sapiens 


myosin regulatory light chain


897 


100 


564 


AF250842 


Drosophila 

melanogaster


multiple asters 


130 


23 


565 


Y58608 


Homo sapiens 


Protein regulating gene expression 

PRGE-1.


1619 


99 


566 


AL121893


Homo sapiens


DA189K21.5 (novel protein similar 
to retinoblastoma binding protein 


1012 


100 


567 


AL117352




ajo/oi5iu.z (novel protem (ortholog 

of rat PYORd.^^ 


3713 


99 


568 


AF228603 


Homo sapiens 


pleckstrin 2 


1841 


100 


569 


AF239243 




histone deacetylase 7


3244 


86 


570 


AF087695 


TTlllCPIllllC 

inUiS lli|4d(./llJ.Lld 


veil 0 


989 


100 


571 


AB046381 


Homo sapiens 


testis-abundant finger protein 


1346 


99 


572 


AC005551 


Homo sapiens


K20529 2, partial CDS 


1020 


100 


573 


Y90290 


Homo sapiens 


Human peptidase, HPEP-7 protein 
sequence. 


274 


52 


574 


W76734 


Homo sapiens


Human mDia Rho targeting protein. 


712 


32 


575 


AL121935 


Homo sapiens


DAD17H2.3 (t-complex 10 (a murine
tcp homolog))


853 


78 


576 


Y86217 


Homo sapiens


Human secreted protein HWHGU54, 


2123 


99 


577 


AL121716


Homo sapiens 


dJ202D23.2 (novel protein) 


6329 


99 


578 


AL121716 




dJ202D23.2 (novel protein)


6329 


99 


579 


X92715 


Homo sapiens


KRAB/C2H2 zinc finger protein


3102 


97 


580 


rX34637^:T* 




protein tyrosine kinase 


5564 


98


581 


X78817


Homo sapiens 


3115 


1148 


44 


582 


AJ251245 


Rattus

norvegicus 


SECIS binding protein 2


3086 


71 


583 


AF113125 


Homo sapiens


E-1 enzyme 


581 


100 


584 


M19529


Sus scrofa


follistatin A


1906 


98 


585 


AF169677 


Homo sapiens 


leucine-rich repeat transmembrane 
protein FLRT3


3403 


100 


586 


D87685 


Homo sapiens 


similar to human transcription factor 
irllb (o34159). 


8083 


99 


587 


Y00876 


Homo sapiens 


Human LAPH-1 protein sequence. 


2110 


100 


588 


Y99674 


Homo sapiens


Human GTPase associated protein- 
25. 


2111 


99 


589 


D86973 


Homo sapiens


similar to Yeast translation activator 
Lrt^rMl (rl.A4ol2o) 


12033 


99 


590 


AL034452 


Homo sapiens 


dJ682J15.1 (novel Collagen triple 
helix repeat containing protein) 


1979 


100 


591 


Y57396 


Homo sapiens 


Human lysoenzyme LYC4 
polypeptide.


814 


100 


592 


AJ297743 


Mus musculus 


torsinB protein 


1448 


85 


593 


AF164796 


Homo sapiens


NADH:ubiquinone oxidoreductase 
MLRQ subunit homolog 


469 


100 


594


Y41312 


Homo sapiens 


Human secreted protein encoded by 
gene 5 clone HLDRM43. 


749 


94 


595 


Y41312 


Homo sapiens 


Human secreted protein encoded by 
gene 5 clone HLDRM43. 


824 


100 


596 
597 


Y77123 
AF215703


Homo sapiens 


Drosophila


Human neurotransmission-associated 
protein (NTAP) 998868. 
CISMET-L long isoform 


2102 
1880 


98 
65 









melanogaster 








598 


AF070447 


Homo sapiens 


barrier-to-autointegration factor 


290 


90 


599 


X56203 


Plasmodium
falciparum 


liver stage antigen 


372 


22 


600 


X79828 


Mus musculus 


NK10


202 


53 


601 


AB004109 


Cricetulus 
griseus 


phosphatidylserine synthase II


2262 


92 


602 


U94988 


Mus musculus 


Nulp1


2912 


89 


603 


U94988 


Mus musculus 


Nulp1


2800 


86 


604 


AF006264 


Homo sapiens 


recombination and sister chromatid 
cohesion protein homolog 


2850 


100 


605 


AF006264 


Homo sapiens 


recombination and sister chromatid 
cohesion protein homolog 


2530 


100 


606 


X82260 


Homo sapiens 


RanGAP1


2929 


100 


607 


X82260 


Homo sapiens 


RanGAP1


1843 


97 


608 


AF160909 


Drosophila 
melanogaster 


BCDNA.LD03471 


943 


58 


610 


X74801 


Homo sapiens 


gamma subunit of CCT chaperonin 


2745 


99 


611 


AL031427 


Homo sapiens 


dJ167A19.1 (novel protein) 


1608 


100 


612 


Y71072 


Homo sapiens 


Human membrane transport protein, 
MTRP-17. 


445 


100 


613 


X16396 


Homo sapiens 


precursor polypeptide (AA -29 to 
315) 


1749 


100 


614 


AK000281 


Homo sapiens 


unnamed protein product 


1814 


99 


615 


AB011128 


Homo sapiens 


KIAA0556 protein 


5761 


99 


616 


U19361 


Petromyzon 
marinus 


NF-180 


205 


21 


617 


AF045555


Homo sapiens


wbscr1


1208 


100


618 


AF045555 


Homo sapiens 


wbscr1 alternative spliced product


1318 


100 


619 


U22229 


Felis catus 


ribosomal protein L41 


128 


100 


620


Y17169


Homo sapiens


A6 related protein


1819




621 


Y12065


Homo sapiens 


hNop56 


2956 


99


622


AF177758 


Homo sapiens 


ubiquitin specific protease 16 


2998 


100 


623 


AF317425


Homo sapiens 


GAC-1 


3866 


100 


624 


AL050297 


Homo sapiens 


hypothetical protein 


1227 


99 


625 


AC007204 


Homo sapiens 


BC273239_1


3398 


99 


626 


Z68747 


Homo sapiens 


imogen 38 


2024 


99 


627 


Z68747 


Homo sapiens 


Imogen 38 


1958 


97 


628 


Y70229 


Homo sapiens 


Human RNA-associated protein-10 
(RNAAP-10), 


3424 


99 


629 


AF191492 


Homo sapiens 


nasopharyngeal carcinoma associated 
gene protein-8


613 


100 


630 


AF119664


Homo sapiens 


transcriptional regulator protein 
HCNGP 


1574 


100 


631 


AF119664


Homo sapiens 


transcriptional regulator protein 
HCNGP 


1150 


89 


632 


Y17849 


Homo sapiens 


ganglioside-induced differentiation 
associated protein 1 


1839 


98 


633 


X55740 


Homo sapiens 


5'-nucleotidase


3012 


100 


634 


AF039688 


Homo sapiens 


antigen NY-CO-3 


931 


100 


635 


AF119662


Homo sapiens 


E46 protein 


2424 


100 


636 


AB007836 


Homo sapiens 


Hic.5 


2544 


100 


637 


AF077818 


Mus musculus 


syntrophin-associated serine-threonine protein kinase


2027 


44 


638 


AL035455 


Homo sapiens 


dJ1018E9.1 (VAMP (vesicle- 
associated membrane protein)- 
associated protein B and C) 


150 


26 


639 


AF078844 


Homo sapiens 


hqp0376 protein 


416 


81 



SEQ
ID 
NO: 



640 



ACCESSION 
NUMBER 



U28377 



641 
642 



AK024442 



U58682 



SPECIES 



Escherichia 
coli 



Homo sapiens 



Homo sapiens 



DESCRIPTION 



ORF_f239; was ORF_f191 and
ORF_f194 before splice



FLJ00032 protein 



ribosomal protein S28



SMITH-
WATERMAN 
SCORE 



1198 



1677 



340 



% 

IDENTITY 



100 



56 
100 



Rattus rattus



644 



AB002348 



646 



Y96202 



Homo sapiens 



ribosomal protein S2 



Homo sapiens 



KIAA0350 protein 



1520 



647 
648 



AB029482 



Mus musculus 
Arabidopsis 



IkappaB kinase (IKK) binding
protein, Y2H56. 



5186 



1178 



JNK-binding protein JNKBP1



4609 



98 
99 



98 



81 



AB009053 



thaliana 



650 
651 



AC002550 
U26592 



Homo sapiens 



contains similarity to isoamyl 

acetate-hydrolyzing 

esterase~gene_id:MQB2.5



407 



652 



X60155 



Homo sapiens 



Unknown gene product 



653 



X53330 



Homo sapiens 



Platynereis 
dumerilii 



654 
655 



AC003682 
X80473 



Homo sapiens 



Mus musculus 



diabetes mellitus type I autoantigen
zinc finger 41
H4 protein (AA 1 



858 



103) 



R27945_2



rab19



253 



4349 



523 



2558 



596 



44 



99 



66 



100 



100 



100 



56 



J02649 



Rattus 
norvegicus 
Homo sapiens 



unknown protein 



201 



95 



AC006014 



similar to RFP transforming protein; 
similar to P14373 (PID:g132517)



1331 



99 



X92972 



659 



L35269 



Homo sapiens 



Homo sapiens 



protein phosphatase 6 



zinc finger protein 



1666 



2803 



100 



99 



AC003682 



661 



X79204 



Homo sapiens 



F18547_1



662 



X17620 



Homo sapiens 



ataxin-1 



3184 



Homo sapiens 



Nm23 protein 



4195 



965 



96 



99 



99 



AB015617



664 



Z56281 



Homo sapiens 



665 



AJ248283 



666 



Homo sapiens 

Pyrococcus 

abyssi 



667 
668 



Z70200 



Z70200 
AF153450 



Homo sapiens 



ELKS 



interferon regulatory factor 3 



1501 



LACTOYLGLUTATHIONE
LYASE (EC 4.4.1.5)
(METHYLGLYOXALASE)
(ALDOKETOMUTASE)
(GLYOXALASE I).



2331 



254 



Homo sapiens 



U5 snRNP-specific 200kD protein



U5 snRNP-specific 200kD protein 



8819 



8589 



80 



100 



40 



99 



97 



Manduca sexta 



669 
670 



AF227198 
X99586 



Homo sapiens 



juvenile hormone esterase binding 
protein 



225 



CrkRS 



671 



Z61589_cd1



Homo sapiens 



Homo sapiens 



SMT3C protein



7231 



17-AUG-1998 DNA encoding a 
human OC-2 protein. 



441 



2593 



32 



99 



87 



100 



AJ132702 



673 



Mus musculus 



AF204159 



Homo sapiens 



ATFa-associated factor 



674 



G02061 



potassium large conductance 
calcium-activated channel beta 3a 
subunit 



3240 



1486 



Homo sapiens 



Human secreted protein, SEQ ID 

NO: 6142. 



558 



88 



100 



99 



G01246 



Homo sapiens 



676 
677 



AB016839 
D86970 



Human secreted protein, SEQ ID 
NO: 5327, 



141 



Homo sapiens 



mob1



419 



77 



42 



Homo sapiens 



678 



679 



U83115 
AF203687 



Homo sapiens 
Homo sapiens 



similar to myosin heavy chain; 
Containing ATP/GTP-binding site 
motif A(P-loop) 



161 



non-lens beta gamma-crystallin like 

protein 



8569 



prolactin regulatory element-binding 
protein 



2181 



28 



99 



100 





680 


M27685 


Mus musculus 


ultra-high sulphur keratin 


650 


58 


681 


U04968 


Cricetulus 
griseus 


nucleotide excision repair protein 


3712 


97 


682 


AF119663


Homo sapiens 


G-protein gamma-12 subunit


356 


100 


683 


G03733 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 7814. 


342 


100 




A07099


Homo sapiens


CDw52 antigen 


297 


100 


685 


AF022789 


Homo sapiens 


ubiquitin hydrolyzing enzyme I


1892 


100 


686 


AJ001006 


Mus musculus 


EMeg32 protein


938 


96 


687 


W03516 


Homo sapiens 


Prostaglandin DP receptor.


1864 


100 


688 


AF019661 


Mus musculus 


zeta proteasome chain; PSMA5 


1214 


100 


689 


AF156557


Homo sapiens 


stomatin related protein 


2036 


100 


690 


G03960 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 8041. 


593 


100 


691 


AF161512 


Homo sapiens 


HSPC163 


738 


100 


692 


AL031115 


Homo sapiens 


ZXDA, ZXDB (zinc finger X-linked 
protein) 


4298 


100 


693 


L40410 


Homo sapiens 


thyroid receptor interactor 


806 


100 


694 


AC004542 


Homo sapiens 


OXYSTEROL-BINDING 
PROTEIN-like; similar to P22059
(PID:g129308)


2533 


99 


695 


AF169411 


Rattus 
norvegicus 


PAPIN 


4144 


52 


696 


Y58168 


Homo sapiens 


Human hydrolase homologue HHH- 
4. 


2144 


100 


697 


AF271994 


Homo sapiens 


dopamine responsive protein DRG-1 


1613 


100 


698 


Y41741 


Homo sapiens 


Human PRO704 protein sequence. 


1323 


100 


699 


AL133506 


Unknown 


/prediction=(method:""genscan"",
version:""1.0"", score:""109.13"");
/prediction=(method:


825 


48 


700


Y96870


Homo sapiens


Human goose-type lysozyme
(GpLt).


1032


100


701 


AC003034 


Homo sapiens 


Gene with similarity to rat kidney- 
specific (KS) gene 


1190 


100 


702 


AC003034 


Homo sapiens 


Gene with similarity to rat kidney- 
specific (KS) gene 


937 


95 


703 


AJ242832 


Homo sapiens 


calpain 


3756 


100 


704 


S52624 


Homo sapiens 


unknown 


185 


100


705


AF005081 


Homo sapiens 


skin-specific protein


652 


100 


706 


Y16793 


Homo sapiens 


keratin, type I


2232


100 


707 


Y44985 


Homo sapiens 


Human epidermal protein-2. 


455 


69 


708 


AF113220


Homo sapiens 


MSTP040 


686 


100 


709 


Y44985 


Homo sapiens 


Human epidermal protein-2. 


408 


65 


710 


Y16132 


Homo sapiens 


CDT6 


1874 


100


711 


Y65115


Homo sapiens 


Amino acid sequence of a human 
phosphorylation effector PHSP-7. 


2407 


100 


712 


A03422 


Homo sapiens 


H(+)-transporting ATP synthase


209 


100 


713


Arlo99oo 


Mus musculus 


DNA binding protein DESRT


1467 


79 


714 


X52563 


Bos taurus


permeability increasing protein


383 


29 


715 


AJ277739 


Homo sapiens 


RPB11b1alpha protein


480 


98 


716 


AL135791 


Homo sapiens 


bA162G10.3 (zinc finger protein)


401 


98 


717 


AF223466 


Homo sapiens 


HT015 protein


1311 


97 


719 


AF117383 


Homo sapiens


placental protein 13; PP13 


746 


100 


720 


Z98743 


Homo sapiens 


dJ181C9.2 (Rho GTPase activating
protein 8 (RhoGAP, p50RhoGAP)) 


324 


100 


721 


AL163815 


Arabidopsis 
thaliana 


putative protein 


653 


61 


722 


G01436 


Homo sapiens 


Human secreted protein, SEQ ID 


418 


96 



SEQ
ID
NO:



723 



724 
725 



726 



ACCESSION 
NUMBER 



AF282919 



Mus musculus 



AB023191 Homo sapiens 



AL031778 Homo sapiens



AL021939 



727 



728 



AF182426



Y08565 



Homo sapiens 



Rattus 
norvegicus 



Homo sapiens 



DESCRIPTION 



NO: 5517. 



Z^228 



KIAA0974 protein 



dJ34B21.1 (novel BZRP 
(benzodiazepine receptor (peripheral)
(MBR, PBR, PBKS, IBP, 
Isoquinoline-binding protein)) LIKE 
protein) 



SMITH- 
WATERMAN 
SCORE 



349 



2953 



920 



dJ352A20.2 (aldehyde 
dehydrogenase family protein) 



arylacetamide deacetylase 



UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase



1764 



791 



3331 



% 

IDENTITY 



49 



100 



100 



100 



42 



99 



Homo sapiens



730 



AL078606 



731
732



Y73352 
AF178432 



Arabidopsis 
thaliana 



novel retinal pigment epithelial cell 
protein 



1652 



putative protein 



277 



Homo sapiens 



HTRM clone 1732368 protein 
sequence. 



1720 



99 



55 



100 



733 



Y17832 



Homo sapiens 



734 



Human 

endogenous 
retrovirus K 



SH3 protein 



env protein 



3302 



223 



Y28859 



Homo sapiens 



735 



U09355 



Oryctolagus 
cuniculus



736 



Y94922 



Homo sapiens 



737



738 



739 



740 



741 



742 



743 



744 



745 
746 



AB027003



AF112200



Mus musculus



AF302154 



B25681 



L27479 



L27479 



Y66745 



AJ001019 
X65453



Homo sapiens



Homo sapiens

Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens



Human mesoderm induction early 
response protein ER1.



2067 



protein phosphatase 2A1 B gamma
subunit 



2352 



Human secreted protein clone pv6_1
protein sequence SEQ ID NO:50.



724 



protein phosphatase 



NADH-oxidoreductase B18 subunit



378 



NADH-oxidoreductase B18 subunit 



739 



SPG protein 



613 



Human secreted protein sequence 
encoded by gene 17 SEQ ID NO:70. 

X123 



6556 



X123 



1410 



1237 



Membrane-bound protein PRO1186.



ring finger protein 



1206 



588 



1292 



100 



34 



98 



99 



99 



84 



100 



88 



100 



99 



99 



97 



99 



99 



747 



Y57897 



748 
749 



AF151069 
AF182404 



Sus



scrofa 
Homo sapiens 



tubulin-tyrosine ligase 



Homo sapiens 



Human transmembrane protein 
HTMPN-21. 



1882 



1173 



HSPC235 



Homo sapiens 



mitochondrial uncoupling protein 1 



1694 



1674 



94 



100 



96 



751 
752 



AF149825 
AL008635 



Homo sapiens



Homo sapiens 
Homo sapiens 



dJ776P7.1 (Novel protein) 



PACSIN3 



2500 



2253 



99 



753 



Y57914 



Homo sapiens 



dJ510H16.2 (high-mobility group 
protein 2-like 1)



3026 



Human transmembrane protein 
HTMPN-38. 



1124 



99 



100 



754
755 



AF285109 
AF004161 



Homo sapiens 
Oryctolagus 



septin 3 isoform B 



1766 



756 
757 
758 



cuniculus 



Z19585
AP001745 



peroxisomal Ca-dependent solute 
carrier 



2371 



759
760 



AF190664 
AF090326 



Homo sapiens 
Homo sapiens 



thrombospondin-4 



Mus musculus 
Mus musculus 



similar to zinc finger 5 protein
LMBR2 



4239 



1857 



AL096677 



Homo sapiens 



AE-1 binding protein AEBP2 



555 



1540 



dJ322G13.3 (novel protein similar to



999



95 



100 



100 



72 



97 



94 











bovine and mouse beta-soluble NSF 
attachment protein (SNAP-beta))






761 


AC003007 


Homo sapiens 


Unknown gene product (partial) 


649 


96 


762 


U66372 


Bos taurus 


ribosomal protein S29 


230 


73 


764 


Y90899 


Homo sapiens 


D1-like dopamine receptor activity
modifying protein SEQ ID NO:1.


1152 


100 


765 


U88169


Caenorhabditis 
elegans 


similar to molybdopterin biosynthesis
MOEB proteins 


1204 


65 


766 


AL118506


Homo sapiens 


dJ591C20.3.1 (novel DnaJ domain 
protein, similar to mouse and bovine 

cysteine string protein) 


1091 


100 


767


AK024693 


Homo sapiens 


unnamed protein product 


3767 


100 


768 


Z11518 


Homo sapiens 


histidyl-tRNA synthetase 


2582 


100 


769 


X13916 


Homo sapiens 


LDL-receptor related precursor (AA 
-19 to 4525) 


25529 


100 


770 


AC009360 


Arabidopsis 
thaliana


Contains 3 PF|00400 WD40, G-beta 
repeat domains. 


333 


33 


771 


AB037685 


Mus musculus 


LANP-like protein 


1246 


91 


772 


AL161578 


Arabidopsis 
thaliana 


putative protein 


335 


46 


773 


AL161578 


Arabidopsis 
thaliana 


putative protein 


333 


47 


774 


AY008271 


Homo sapiens 


helicase SMARCAD1


5264 


99 


775


Y21591 


Homo sapiens 


Human secreted protein (clone 
CC332-33). 


1127 


96 


776


W88853 


Homo sapiens 


Polypeptide fragment encoded by 
gene 89. 


752 


100 


777


W88853 


Homo sapiens 


Polypeptide fragment encoded by 
gene 89. 


752 


100 


778


W88853 


Homo sapiens 


Polypeptide fragment encoded by 
gene 89.


752 


100 




AF196481 


Homo sapiens 


RING finger protein; FXY2 


3644 


100


780 


AL035427 


Homo sapiens 


dJ769N13.1 (KIAA0443 protein.) 


1609 


54


781 


AB026187 


Homo sapiens 


protocadherin-Xa 


5244 


100 


782 


B24458 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 22 SEQ ID NO: 83. 


1002 


100 


783 


AB027289 


Homo sapiens 


cyclin-E binding protein 1 


5421 


100 


784 


G02916 


Homo sapiens 


Human secreted protein, SEQ ID
NO: 6997. 


627 


100 


785 


AJ245822 


Homo sapiens 


type I transmembrane receptor 


4560 


100 


786 


AJ245820 


Homo sapiens 


type I transmembrane receptor 


4624 


100 


787 


Z48042 


Homo sapiens 


GPI-anchored protein p137


3340 


99 


788 


AL031782 


Homo sapiens


dJ708F5.1 (PUTATIVE novel 
Collagen alpha 1 LIKE protein)


2739 


100 


789


AJ131245


Homo sapiens 


Sec24B protein 


6602 


100 


790 


AF107203


Homo sapiens 


ataxin 2-binding protein 


2008 


100 


791 


Y14690 


Homo sapiens 


procollagen alpha 2(V) 


600 


34 


792 


AL031055


Homo sapiens 


dJ28H20.2 (novel protein) 


1267 


100 


793 


Y36194 


787 


Human secreted protein 


2051 


99 


794 


AB028127


Homo sapiens 


mannosyltransferase 


2138 


96 


795 


AC007228 


Homo sapiens 


R31665_2


2738 


79 


796 


AL049482 


Arabidopsis 
thaliana 


putative protein 


436 


47 


797 


AC004528 


Homo sapiens 


R32184_3


891 


91 


798 


AB037830 


Homo sapiens 


KIAA1409 protein 


7532 


100


799 


X53793 


Homo sapiens 


5' half of the product is homologues
to Bacillus subtilis SAICAR
synthetase, 3' half corresponds to the
catalytic subunit of AIR carboxylase


2232 


100 



SEQ
ID
NO:



800 



801 

802 



ACCESSION 
NUMBER 



Y99350 



AB042636 
AB029324 



SPECIES 



Homo sapiens 



Homo sapiens 



DESCRIPTION 



Human PRO1378 (UNQ715) amino
acid sequence SEQ ID NO:33. 



junctophilin type3 



SMITH- 
WATERMAN 
SCORE 



1343 



1225 



% 

IDENTITY 



100 



47 



Rattus 
norvegicus 



TIP120-family protein TIP120B 



3916 



90 



804 



AF251040 



805 



AB033281 



806 



U87305 



Rattus 
norvegicus 



TIP120-family protein TIP120B



4961 



Homo sapiens 
Homo sapiens 



putative nuclear protein 



F-box and WD-repeats protein beta-TRCP2 isoform C



2119 



2879 



Rattus 
norvegicus 



transmembrane receptor UNC5H1 



3257 



90 



100 



100 



90 



AF118889 



Rattus 
norvegicus 



b-tomosyn isoform 



3155 



97 



AF226993 



Rattus 
norvegicus 
Homo sapiens 



selective LIM binding factor



8793 



95 



W19919 



810 



AL031782



Homo sapiens 



Human Ksr-1 (kinase suppressor of 
Ras). 



3939 



811 



AC002542 



Homo sapiens 



dJ708F5.1 (PUTATIVE novel
Collagen alpha 1 LIKE protein) 



1546 



812 



U83246 



813 



AF242552 



Homo sapiens 



similar to C. elegans F11A10.5; 80%
similarity to Z68297 (PID:g1130619)



2294 



814 
815 



X52332 



Gallus gallus 



copine I 



606 



retmovm 



Homo sapiens 



zinc finger protein 10 



945 



1651 



99 



100 



100 



52 



34 



93 



X52332 



816 



Y09631 



Homo sapiens 



817 



X71997 



Homo sapiens 



zinc finger protein 10 



818 



819 



AY004877 



Rattus 
norvegicus 



PIBFl protein 



2423 



myosin I



2935 



3883 



Mus musculus 



Y27196 



Homo sapiens



cytoplasmic dynein heavy chain 
Human cyclic nucleotide 
phosphodiester PDE8B(E)lmini 
acid sequence. 



11105 



3790



99 



99 



98 



98 



100 



AF081947 



Mus musculus 



tektin 



1134 



81 



AL035106 



Homo sapiens 



dJ998C11.1 (continues in
Em:AL445192 as bA269H4.1)



871 



100 



822 



AF022795 



Homo sapiens 



823 



AF015770 



824 



U82695 



825 



X77371 



826 
827 



AB014576 
AL049733 



Mus musculus 



Homo sapiens 

Mesocricetus 

auratus 



Homo sapiens 
Homo sapiens 



TGF beta receptor associated protein- 



385 



radical fringe



expressed"Xq28STS protein 



1422 



COR1



1444 



641 



KIAA0676 protein



296 



24 



82 



99 



78 



79 



828 



AF222980 



829 
830 



Z31560 
AF295773 



Homo sapiens 
Homo sapiens 



dJ875H3.1 (APK1 antigen)



disrupted in Schizophrenia 1 protein 



sox-2 



1584 



4418 



1683 



72 



100 



100 



Homo sapiens 



ral guanine nucleotide dissociation 
stimulator 



4717 



99 



AB041926 



832 



L04948 



833 



Homo sapiens 
Saccharomyces
cerevisiae



AJ007012 



834 



Mus musculus 



Z34289 



Homo sapiens 



GCK family kinase MINK-2 
mitochondrial transporter protein 



6m 



338 



Fish protein 

nucleolar phosphoprotein p130



704 



3455 



100 



35 



94 



99 



U10991 



Homo sapiens 
Homo sapiens 



G2 



8436 



98 



836 
837 



AF230877 
X58288 



MIP-T3 



2945 



99 



838 



X56958 



839 



AC024791 



Homo sapiens 



Homo sapiens 

Caenorhabditis 

elegans 



protein-tyrosine phosphatase



7734 



ankyrin (brank-2)



9631 



contains similarity to beta-lactamases 



370 



99 



100 



24 



WO 01/57190



PCT/US01/04098









Homo sapiens 


ankyrin repeat protein 


802 


99 




ArVjDf 11 


Serinus
canaria


neurofilament medium subunit 


192 


31 


842


AF283772 


Homo sapiens 


similar to Homo sapiens ribosomal 
protein L10 encoded by GenBank
Accession Number L25899 


990 


96 




U /0j4j 


Homo sapiens 


GABA transport protein 


2992 


98 




I 1 j04D 


Homo sapiens 


uroplakin II 


897 


100 


Of J 


X^ZiU04 


Homo sapiens 


similar to rat general mitochondrial 
matrix processmg protease mivN a 
n? AT^yfPP^ 

^i\A 1 IVJJr ) . 


2710 


99 


846 


AF107S99 


Homo sapiens


iNiemann-jricK k^j proiem, jnjtL'P 


/U4/ 


100


847 


AF107S77 


Homo sapiens


IN lemann-r iCK k^j proiem^ jni^Uj 


j4/Z 


100


848 


X60489 


Homo sapiens 


elongation fector-l-beta 


1162 


100 






Homo sapiens 




2277 


67 


OJW 




Homo sapiens 


KriooJU__l 


2401 


100 


851 


AL121583 


Homo sapiens 


bA358N2.1 (novel protein) 


353 


61 




Z.454/D 


Homo sapiens 


glucokinase regulator 


3155 


99 


ODD 


A«53544 


Homo sapiens


aJ37£l6^ (SH3-aomam bmoing 
protein 1) 


1884 


98 


854 


AF233323 


Homo sapiens 


Fas-associated phosphatase-1


390 


36 


occ 
ODD 


AF0o2741 


Rattus 
norvegicus 


pyruvate dehydrogenase phosphatase 
isoenzyme 2 


447 


80 


OJD 


VI 1/111 

I 11411 


Homo sapiens 


pristanoyl-CoA oxidase 


3595 


98 


OJ / 


KirQ71 QQ 


Strongylocentrotus
purpuratus


tektin A1


290 


46 


oDo 


AtJUUl lUj 


Homo sapiens


hippocalcin-like protein 4.


995 


100 




Arlo4/yi 


Homo sapiens 


putative 38.3kDa protein


1795 


100 


oOU 


ArZyol 1 / 


Homo sapiens 


homeobox protein pTX2


1477 


93 


861


AF015264


Rattus
norvegicus


golgi peripheral membrane protein
p65


1820


81




Aloyui 


Homo sapiens 


30Kb suounit ot RAB30 /74 


1284 


100 


863 


M12140 


Homo sapiens 


envelope protein 


202 


81 


o04 


AF161459


Homo sapiens 


HSPC109 


815 


98 




AL109983


Homo sapiens


dJ718P11.1.1 (novel class II
aminotransferase similar to serine
palmitoyltransferase (isoform 1))


444 


100 


500 


jVl / / 1 5 J 


Rattus 
norvegicus 


alpha-1-macroglobulin


227 


45 






Homo sapiens


gephyrin


■370^ 
J /OJ 


100


868 


X75285 


Mus musculus 


fibulin-2


3258 


87 




Y87Z10A 
A6Z4y4 


Homo sapiens


fibulin-2


3407 


99 






Mus musculus


torsinB protein 


169 


43 


871 


AJ278313 


Homo sapiens 


phospholipase C-beta-1a


6258 


99 


872


A17n7'3'3/l/1 


Homo sapiens 


ubiquitin-specific protease 3 


256 


43 


873 


Y91955 


Homo sapiens 


Human cytoskeleton associated 
protein 10 (CYSKP-10).


535 


100 


874




— — : 

Homo sapiens 


Cdc42-interacting protein 4 


1136 


53 


875 


AF265555 


Homo sapiens 


ubiquitin-conjugating BIR-domain 
enzyme APOLLON 


627 


100 


876


i4oJoO 


Homo sapiens 


Human breast tumour-associated 
protein 47. 


2537 


98 


877 


AFl 871QR 




intersectin 2 long isoform


5 /04 


99 


878 


L17308 


Gossypium 
hirsutum 


proline-rich cell wall protein 


192 


35 


879 


AF177169 


Homo sapiens 


tropomodulin 2


1769 


100 


880 


W03627 


Homo sapiens 


Human follicle stimulating hormone 
GPR N-terminal sequence.


210 


23 



SEQ
ID 
NO: 



881 
882 



883 



885 



ACCESSION 
NUMBER 



AL021068 



AC005498
AF165518 



SPECIES 



Homo sapiens 



Homo sapiens 



Homo sapiens 



U13045 



886 
887 



X52836
X51466 



Homo sapiens 



DESCRIPTION



dJ206D15.3
R31665_2



MAGOH isoform



Homo sapiens 



Homo sapiens 



protein tyrosine phosphatase (PTP-BAS, type 3)



nuclear respiratory factor-2 subunit 
beta1



tryptophan hydroxylase (AA 1 - 444) 



SMITH- 
WATERMAN 
SCORE 



2615 



318 



182 



368 



869 



2320 



% 

IDENTITY 



99 



82 



94 



43 



62 



98 

100 



888 

889
890 



891 



892 



893 
894 



AB039903 

X51760 
AJ243396 



Homo sapiens 



Homo sapiens 

Homo sapiens 
Homo sapiens 



elongation factor 2 



W67928 



Homo sapiens 



AB020598 



Y66648 



Homo sapiens 



Homo sapiens



interferon-responsive finger protein 1
long form 

zinc finger protein (583 AA) 
voltage-gated sodium channel beta-3
subunit 



4460 



1096 

3l3r 
1024 



Fragment of human secreted protein 
encoded by gene 4. 



391 



peptide transporter 3 

Membrane-bound protein PROl 1207 



3017 



4722 



98 

Tor 

100 



100 



100 



99 

96 



895 



896 



897 
898 



899 



900 



A29218_cd1



Homo sapiens 



Homo sapiens 



Membrane-bound protein PRO1120



AJ000332 



X98259 



Homo sapiens 



19-NOV-1998 DNA encoding G- 
protein coupled 7 TM receptor with 
AXOR15 activity.



3606 



2178 



X57110 



Homo sapiens 



Glucosidase II 



X63652 



Homo sapiens 



M-phase phosphoprotein 8 



Homo sapiens



c-cbl protein 



inter-alpha-trypsin inhibitor heavy 



901 



902 



903
904 



X85134 



L11672



Homo sapiens 



chain ITIH1



Y85565 



X54S11 
Z98265 



Homo sapiens 



Homo sapiens 



Homo sapiens 
Homo sapiens 



RB protein binding protein 



zinc finger protein 
Human homologue of UNC-53 (Hs
UNC-53/2) sequence.



ras related protein Rab5b



5063 



1085 



4849 



3376 



2816 



2047 



369 



1094 



100 



99 



100 



99 



98 



99 



58 



83 



100 
100 



905 



AL035295 



906 



AF051782 



Homo sapiens 



plakophilin 3 



907 
908 



AF208536 
U79240 



909 
910 



U79240 



Homo sapiens 



hypothetical protein



Homo sapiens 



diaphanous 1 



Homo sapiens 
Homo sapiens 



nucleotide binding protein; NBP 



4065 



959 




serine/threonine protein kinase



serine/threonine protein kinase



801 



1372 



2365 
2386 



99 



35 



100 



98 



99 
100 



911 
912 



AJ132545 
AL121733 



Homo sapiens 



Homo sapiens 



protein kinase 



913 



Y67579 



Homo sapiens 



protein kinase 



2921 



Homo sapiens 



hypothetical protein 



1637 



914
915 



X87342 
X87342 



Homo sapiens 
Homo sapiens 



Human death inducer-obliterator 1 
(DIO-1) polypeptide. 



Human giant larvae homologue 



1344 



1586 



5317 



99 



99 



100 



99 
96 



916 
917 



M94362 



AJ011654 



918 



AJ131899 



919 
920 



AF054986 
U95822 



Homo sapiens 



Human giant larvae homologue 



lamin B2 



Homo sapiens 



Rattus 
norvegicus 



triple LIM domain protein 



921
922 



Y11588 
X84195 



Homo sapiens 
Homo sapiens 



proline rich synapse associated 
protein 1 



Homo sapiens 
Homo sapiens



putative transmembrane GTPase 



putative transmembrane GTPase 



apoptosis specific protein 



3495 



2357 



3432 



5776 



1816 



1237 



1492 



93 



100 



88 



100 



100 



923 



U72882 



924 



AE000660 



925 



AF126245 



Homo sapiens 



acylphosphatase 
interferon-induced leucine zipper 



510 



Homo sapiens 



protein



1409 



Homo sapiens 



hADV36Sl 

acyl-Coenzyme A dehydrogenase-8 



573 



2162 



precursor 



100 



99 



100 



100 





926


AE001968 


Deinococcus 
radiodurans 


hypothetical protein 


147 


27 


927 


W81576 


Homo sapiens


EBV-induced G-protein coupled 
receptor (EBI-2) polypeptide. 


1778 


100 


928 


U01317 


Homo sapiens 


beta-globin 


687 


94 


929 


X98333 


Homo sapiens 


organic cation transporter 


2933 


100 


930 


Y91444 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 42 SEQ ID
NO: 165. 


1401 


100 


931 


Y91644


Homo sapiens 


Human secreted protein sequence 
encoded by gene 43 SEQ ID 
NO:317. 


1243 


100 


932 


D90279


Homo sapiens 


collagen alpha 1(V) chain precursor


569 


39 


933 


Z31560 


Homo sapiens 


sox-2 


1587 


96 


934 


AF147790 


Homo sapiens


transmembrane mucin 12 


3047 


99 


935 


Z85996 


Homo sapiens 


match: multiple proteins; match:
Q08151 P28185 Q01111 Q43554;
match: Q08150 Q40195 P20340
Q39222; match: Q40368 P36412
P40393 Q40723; match: CE01798
Q38923 Q40191 Q41022; match:
Q39433 Q40177 Q40218 Q08146;
match: P10949 P11023 Q16948
Q20337; match: Q25389 P25228
P20336 P05713; match: P35276
Q08147 P17609 P22128; match:
Q15771 P36410 P35291; GTP-binding


726 


94


936 


AB041533


Homo sapiens 


sperm antigen 


1054 


38 


937 


X91906 


Homo sapiens 


voltage-gated chloride ion channel 


3914 


100 


938


.-^03248*1 J: 


Homo sapiens 


homeobox transcription factor


1744


100


939


AFl]li06" 


Homo sapiens 


protein serine/threonine phosphatase 
4 regulatory subunit 1 


4682 


99


940 


Y17999


Homo sapiens 


Dyrk1B protein kinase


3331 


99 


941 


AF305872 


Homo sapiens 


thyroglobulin 


455 


92 


942 


AF263462 


Homo sapiens 


cingulin 


5939 


99 


943 


AK024442 


Homo sapiens 


FLJ00032 protein 


1616 


61 


944 


Y35911 


Homo sapiens 


Extended human secreted protein 
sequence, SEQ ID NO. 160. 


262 


35 


945 


AB015320 


Homo sapiens 


sigma1B subunit of AP-1 clathrin
adaptor complex 


599 


71 


946 


Z82287 


Caenorhabditis 
elegans 


ZK550.2 


229 


35 


947 


D84223 


Homo sapiens 


leucyl tRNA synthetase 


6207 


99 


948 


U49057 


Rattus 
norvegicus 


rA9 


3846 


62 


949 


AK000568 


Homo sapiens 


unnamed protein product 


1659 


100 


950 


AL021578 


Homo sapiens 


dJ453C12.6.1 (uncharacterized
hypothalamus protein (isoform 1)) 


257 


42 


951 


AB032435 


Homo sapiens 


differentiation-associated Na-dependent inorganic phosphate
cotransporter 


3063 


99 


952 


AF110532


Homo sapiens 


uncoupling protein UCP-4 


1561 


100 




j\.OJ Jo / 


iVLus muscuius 


lAij protem 


1420 


59 


954 


AL031665 


Homo sapiens


dJ545L17.5.1 (novel protein) 


386 


53 


955 


Y87600 


Homo sapiens 


Human fatty acid synthase-like 
protein (HFASLP). 


2377 


100 


956 


Y99421 


Homo sapiens 


Human PRO1433 (UNQ738) amino
acid sequence SEQ ID NO:292. 


522 


55



SEQ
ID 
NO: 



957 
958 






ACCESSION 
NUMBER 



U68535 
AC007067 



SPECIES 



DESCRIPTION



Mus musculus 



aldo-keto reductase 



SMITH- 
WATERMAN 
SCORE 



451 



%

IDENTITY



73 
57 



959 
960 



961 



962 



964 



965 



966 



967 



968 



969 



Arabidopsis 

thaliana 



T10O24.10



U72194 
AE003661 



Mus musculus 



muskelin 



X80332 



Drosophila 
melanogaster 
Mus musculus 



CG15168 gene product 



Y67315 



rab20 



Homo sapiens 



Y67315 



Homo sapiens 



Human secreted protein BL89_13 
amino acid sequence. 



L32602 



Z97832 



Rattus 
norvegicus 



Human secreted protein BL89_13 
amino acid sequence. 



homeodomain 159..341 



W88995 



Homo sapiens 



Homo sapiens 



U12465 
AF151803 



W74865 



Homo sapiens 



Homo sapiens 
Homo sapiens 



dJ329A5.3 (KIAA06460 protein) 



Polypeptide fragment encoded by 
gene 146. 



ribosomal protein L35 



CGI-45 protein 



Human secreted protein encoded by 
gene 137 clone HMWIF35. 



1594 



3947 



277 



983 



3916 



3916 



1821 



3581 



176 



604 



1101 



1348 



99 



54 



82 



99 



99 



96
99



39 



100 



78 



98 



Homo sapiens 



succinate dehydrogenase flavoprotein 
subunit



703 



100 



Drosophila 

buzzatii 

Homo sapiens 



protease, reverse transcriptase, 
ribonuclease H, integrase 



194 



23 



N-acetylgalactosaminyltransferase; 
similar to Q10473 (PID:g1709559)



3271 



100 



974 



975 
976 



M17885



U22829 
AL132772 



Schizosaccharomyces
pombe
Homo sapiens 



DNA2-NAM7 helicase family
protein 



Mus musculus 



acidic ribosomal phosphoprotein (P0)



685 



P2Y purinoceptor 



792 



399 



31 



100 



977 



978 



979 



980 
981 



Homo sapiens 



AC003973 



J04031 



Homo sapiens 



Homo sapiens 



AF136715 
Z92822 



Homo sapiens 



Homo sapiens 
Caenorhabditis 



dJ1013A22.1 (hepatic nuclear factor
4, alpha)



2466 



ZNF91L 



MDMCSF (EC 1.5.1.5; EC 3.5.4.9;
EC 6.3.4.3)



1550 



2824 



taxol resistant associated protein 



taxol resistant associated protein 
ZK520.1



217 



306 



99 



43 



63 



76 



95 
44 



elegans 



1109 



983 

TIT 



AL021331
AL161501 



Homo sapiens 
Homo sapiens 



putative dipeptidase 



dJ366N23.3 (KIAA0173 and 
Tubulin-Tyrosine Ligase LIKE) 



1564 



1492 



99 



100 



Arabidopsis 
thaliana 



putative adenosine deaminase 



370 



38 



TABLE 3



SEQ 
ID 
NO: 


ACCESSION 
NO. 


DESCRIPTION


RESULTS* 


2 


BL00282 


Kazal serine protease inhibitors family
proteins. 


BL00282 16.88 4.259e-14 97-120


3 


BL00298 


Heat shock hsp90 proteins family
proteins. 


BL00298A 10.97 1.000e-40 74-
119 BL00298E 27.30 1.000e-40
321-376 BL00298F 11.21 1.000e-
40 409-464 BL00298H 20.50
1.000e-40 553-607 BL00298C
16.40 2.286e-40 186-230
BL00298B 15.64 1.290e-39 134-
181 BL00298G 24.57 5.345e-39
465-520 BL00298I 30.07 7.818e-
34 661-715 BL00298D 17.97
6.226e-33 242-282


4 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237A 11.48 4.316e-13 57-82 


5 


PD02454 


!!!! PROTEIN ALU SUBFAMILY 
WARNING ENTRY NUCLEAR 
PHOSPHO. 


PD02454B 11.61 4.309e-17 75- 
103 


6 


DM00864 


EGF-LIKE DOMAIN. 


DM00864A 15.21 7.429e-09 98-
119


7 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237A 11.48 1.750e-11 29-54
PR00237D 8.94 7.000e-09 138-
160 PR00237B 13.50 8.250e-09
61-83


9 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.667e-15 272-289


10 


BL00139 


Eukaryotic thiol (cysteine) proteases
cysteine proteins. 


BL00139D 9.24 4.400e-11 391-
408 BL00139A 10.29 7.511e-09
67-77


12 


BL01113 


C1q domain proteins.


BL01113B 18.26 9.294e-19 689-
725 BL01113C 13.18 4.857e-11
757-777 BL01113D 7.47 2.161e-
10 790-800


13 


BL01113 


C1q domain proteins.


BL01113B 18.26 3.813e-14 599-
635 BL01113C 13.18 4.857e-11
667-687 BL01113D 7.47 2.161e-
10 700-710


14 


BL00594 


Aromatic amino acids permeases 
proteins.


BL00594A 16.75 6.531e-10 50-94


15 


BL01047


Heavy-metal-associated domain proteins.


BL01047B 19.73 4.913e-13 707-
728


16


PR00625 


DNA J PROTEIN FAMILY 
SIGNATURE 


PR00625A 12.84 7.462e-18 310-
330 PR00625B 13.48 3.939e-15
340-361


18 


BL00615 


C-type lectin domain proteins. 


BL00615A 16.68 3.700e-09 144- 
162 


20 


PR00741 


GLYCOSYL HYDROLASE FAMILY 
29 SIGNATURE 


PR00741D 16.11 9.082e-21 175-
195 PR00741F 14.66 9.262e-21
243-265 PR00741B 14.23 1.947e-
18 128-145 PR00741G 9.29
2.180e-17 318-340 PR00741C
9.16 7.328e-17 147-166
PR00741H 10.32 2.141e-13 351-
374 PR00741A 9.24 3.596e-13
89-105 PR00741E 13.39 3.535e-
12 215-232


22 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 3.647e-20 117-
148 BL00107B 13.31 1.000e-16
182-198


23 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 1.600e-23 126- 
157 


24 


BL00107 


Protein kinases ATP-binding region
proteins. 


BL00107A 18.39 1.600e-23 126- 
157 


27 


BL00239 


Receptor tyrosine kinase class II proteins.


BL00239B 25.15 2.324e-16 91- 
139 


28 


BL00018 


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 3.250e-10 681-694 
BL00018 7.41 6.400e-10 717-730 


29 


BL00018 


EF-hand calcium-binding domain 


BL00018 7.41 3.250e-10 681-694 






WO 01/57190




proteins. 






BL00018 7.41 6.400e-10 717-730



30 
33 



BL01113



PD01168 



C1q domain proteins.



34 



SYNTHETASE LIGASE PROTEIN 
ALANYL. 



BL01113A 17.99 9.308e-09 54-81



PD01168 



36 



SYNTHETASE LIGASE PROTEIN 
ALANYL. 



PD01168L 9.47 1.667e-09 401-
416



PR00426 



37 
"IT 



C5A-ANAPHYLATOXIN RECEPTOR
SIGNATURE 



PF00791 



BL00350 



40 



BL00123 



Domain present in ZO-1 and Unc5-like
netrin receptors. 



MADS-box domain proteins. 



Alkaline phosphatase proteins.



PD01168L 9.47 1.667e-09 411-
426



PR00426D 10.59 3.618e-12 110-
122



PF00791B 28.49 2.049e-10 1080-
1135



BL00350 20.79 1.000e-40 1-55



44 



PD00066 



PROTEIN ZINC-FINGER METAL- 
BINDI. 



BL00123B 19.31 1.000e-40 90-
133 BL00123C 24.61 1.000e-40
145-195 BL00123E 22.25 1.000e-
40 304-358 BL00123G 26.01
1.000e-40 438-488 BL00123F
19.03 8.714e-35 364-399
BL00123A 10.80 9.000e-24 52-77
BL00123D 12.73 1.000e-17 216-
229



PD00066 13.92 2.800e-14 346-359
PD00066 13.92 4.600e-14 486-499
PD00066 13.92 1.000e-13 374-387
PD00066 13.92 6.000e-13 458-471
PD00066 13.92 2.714e-12 234-247
PD00066 13.92 3.143e-12 430-443
PD00066 13.92 8.714e-12 514-527
PD00066 13.92 3.739e-11 402-415
PD00066 13.92 2.038e-10 318-331



DM00973 



47 



BL00649 



3 kw RESISTANCE BENOMYL
YLL028W CYCLOHEXIMIDE.



G-protein coupled receptors family 2 
proteins. 



DM00973A 21.17 2.946e-10 180-
217



50 



PD00066 



PROTEIN ZINC-FINGER METAL- 
BINDI. 



BL00649C 17.82 1.682e-10 475-
501 BL00649B 20.68 7.387e-09
417-463



51 



BL00226 



Intermediate filaments proteins. 



PD00066 13.92 8.200e-16 445-458
PD00066 13.92 5.846e-15 305-318
PD00066 13.92 1.000e-14 221-234
PD00066 13.92 1.000e-14 417-430
PD00066 13.92 2.800e-14 249-262
PD00066 13.92 2.800e-14 277-290
PD00066 13.92 8.800e-14 333-346
PD00066 13.92 9.400e-14 361-374
PD00066 13.92 4.000e-13 389-402
PD00066 13.92 6.571e-12 473-486



BL00226D 19.10 1.000e-40 417-
464 BL00226B 23.86 3.348e-35
251-299 BL00226C 13.23 1.429e-
24 316-347 BL00226A 12.77
1.857e-15 151-166



43 KD POSTSYNAPTIC PROTEIN 
SIGNATURE 

Cadherins extracellular repeat proteins 



PR00217C 10.91 5.648e-09 133-
149



domain proteins. 



BL00303 



S-100/ICaBP type calcium binding



BL00232B 32.79 1.000e-40 143-
191 BL00232A 27.72 2.350e-28
49-82 BL00232B 32.79 7.052e-21
252-300 BL00232C 10.65 6.625e-
20 250-268 BL00232B 32.79
1.314e-11 367-415 BL00232C
10.65 9.308e-10 470-488
BL00303B 26.15 8.759e-23 125-






protein. 


162 BL00303A 21.77 1.000e-21
82-119


58 


PR00378 


INOSITOL PHOSPHATASE 
SIGNATURE 


PR00378D 16.86 1.000e-15 242-
261 PR00378B 13.80 9.250e-13
109-129


59 


PR00425 


BRADYKININ RECEPTOR 
SIGNATURE 


PR00425C 13.23 9.040e-12 120-
140 


60 


BL00280 


Pancreatic trypsin inhibitor (Kunitz)
family proteins. 


BL00280 24.61 6.727e-38 238-282 
BL00280 24.61 1.514e-30 294-338


65 


BL01019 


ADP-ribosylation factors family proteins. 


BL01019A 13.20 1.222e-11 43-83


68 


PR00237 


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE


PR00237E 13.03 5.091e-13 188-
212 PR00237G 19.63 7.207e-13
268-295 PR00237A 11.48 4.375e-
11 24-49 PR00237C 15.69
3.057e-10 101-124 PR00237D
8.94 4.750e-10 137-159
PR00237F 13.57 5.364e-10 230-
255 PR00237B 13.50 9.438e-10
57-79


70 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.938e-28 31-70 


71 


PR00830 


ENDOPEPTIDASE LA (LON) SERINE 
PROTEASE (S16) SIGNATURE 


PR00830A 8.41 8.759e-12 348-
368 


72 


BL00120


Lipases, serine proteins. 


BL00120B 11.37 2.149e-10 148-
163


11 


PR00753 


1-AMINOCYCLOPROPANE-1-
CARBOXYLATE SYNTHASE
SIGNATURE


PR00753E 8.01 3.552e-11 191-
216 PR00753D 6.85 2.778e-09
131-153


78 


PR00506 


D21 CLASS N6 ADENINE-SPECIFIC
DNA METHYLTRANSFERASE
SIGNATURE


PR00506C 19.40 8.017e-09 96-
119


82 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 3.571e-16 436- 
467 


84 


BL00675 


Sigma-54 interaction domain proteins 
ATP-binding region A proteins. 


BL00675A 24.86 8.800e-10 256- 
300 


85 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 2.286e-30 117-160 


87 


BL00250 


TGF-beta family proteins. 


BL00250A 21.24 6.786e-36 264- 
300 BL00250B 27.37 1.450e-26
328-364 


91 


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 9.250e-17 10-35 
BL00215A 15.82 6.000e-16 221- 
246 BL00215A 15.82 7.857e-12 
108-133 BL00215B 10.44 9.526e- 
11 168-181 


92 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 9.526e-24 324-367 


95 


PR00094 


ADENYLATE KINASE SIGNATURE 


PR00094C 12.94 1.000e-08 119-
136 


96 


PD02327 


GLYCOPROTEIN ANTIGEN 
PRECURSOR IMMUNOGLO. 


PD02327B 19.84 2.091e-09 143- 
165 


97 


BL00752 


XPA protein.


BL00752B 19.17 7.309e-09 28-72 


98 


PR00876 


NEMATODE METALLOTHIONEIN 
SIGNATURE 


PR00876B 7.66 2.268e-10 135- 
149 


99 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 9.824e-12 122- 
141 


100 


BL00027 


'Homeobox' domain proteins.


BL00027 26.43 7.429e-31 118-161 


101 


BL00028 


Zinc finger, C2H2 type, domain proteins. 


BL00028 16.07 6.870e-12 370-387
BL00028 16.07 6.885e-11 398-415
BL00028 16.07 8.269e-11 342-359
BL00028 16.07 4.300e-10 229-246


102 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


BL00028 16.07 6.100e-10 258-275
PR00048A 10.52 7.750e-14 665-
679 PR00048A 10.52 8.500e-14
581-595 PR00048A 10.52 9.250e-
14 637-651 PR00048A 10.52
2.059e-12 609-623 PR00048A
10.52 2.588e-12 469-483
PR00048A 10.52 7.353e-12 553-
567 PR00048A 10.52 2.895e-11
525-539 PR00048A 10.52 4.316e-
11 441-455 PR00048A 10.52
5.263e-11 413-427 PR00048B
6.02 2.125e-10 569-579
PR00048B 6.02 4.938e-10 513-
523 PR00048A 10.52 5.$96e-10
497-511 PR00048B 6.02 8.875e-
10 429-439 PR00048B 6.02
1.000e-09 457-467 PR00048B
6.02 6.684e-09 485-495


103 


PR00195 


DYNAMIN SIGNATURE 


PR00195A 11.94 5.364e-22 31-50
PR00195B 9.47 1.783e-21 56-74
PR00195C 11.50 3.455e-21 126-
144 PR00195D 11.76 8.714e-21
175-194 PR00195F 16.20 8.500e-
20 217-237 PR00195E 9.82
8.650e-20 194-211


104 


BL01113


C1q domain proteins.


BL01113A 17.99 1.865e-09 121-
148 BL01113A 17.99 5.846e-09
82-109


105 


BL00420 


Speract receptor repeat proteins domain 


BL00420A 20.42 6.400e-11 70-99
BL00420A 20.42 8.525e-10 73- 
102 BL00420A 20.42 5.708e-09 
85-114 


108 


PR00860 


VERTEBRATE METALLOTHIONEIN 
SIGNATURE 


PR00860B 7.04 2.929e-20 27-41 
PR00860A 5.46 5.500e-16 5-18 
PR00860C 9.61 1.474e-14 41-51


112 


BL01031 


Heat shock hsp20 proteins family profile.


BL01031C 17.68 6.400e-10 122-
147 


114 


DM01840 


kw SPAC24B11.09 R07E5.13. 


DM01840B 22.04 2.688e-40 59-
103 DM01840A 10.95 Q 57lft-i^
31-43 


115 
116 


BL01126 
BL00216 


Elongation factor Ts proteins.
Sugar transport proteins. 


BL01126A 18.48 2.317e-30 46-89
BL01126B 13.15 7.387e-19 116-
135 BL01126C 9.20 9.735e-11
190-203


118 


BL00437 


Catalase proximal heme-ligand proteins.


BL00216B 27.64 4.375e-21 35-85
BL00437A 18.82 1.000e-40 49-
101 BL00437B 16.28 1.000e-40
114-168 BL00437C 21.86 1.000e-
40 190-239 BL00437D 25.72
1.000e-40 248-301 BL00437E
23.95 1.000e-40 327-379


119 
120


BL00140


Ubiquitin carboxyl-terminal hydrolase
family 1 cysteine activ.


BL00140D 22.64 8.274e-14 164-
208 BL00140C 11.80 5.444e-10
77-102


122 
123 


BL00224

BL00203
PR00041


Clathrin light chain proteins.

Vertebrate metallothioneins proteins.
CAMP RESPONSE ELEMENT


BL00224B 16.94 6.712e-10 95-
148

BL00203 13.94 1.000e-40 16-62
PR00041D 7.95 2.906e-09 24-41






BINDING (CREB) PROTEIN 




124 


PR00041 


CAMP RESPONSE ELEMENT 
BINDING (CREB) PROTEIN
SIGNATURE 


PR00041D 7.95 2.906e-09 24-41 


125 


BL00061 


Short-chain dehydrogenases/reductases 
family proteins. 


BL00061C 7.86 3.250e-10 212-
222


126 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU.


PD01066 19.43 6.400e-25 251-290 


127 


PR00318 


ALPHA G-PROTEIN (TRANSDUCIN) 
SIGNATURE 


PR00318D 16.28 1.900e-34 219-
248 PR00318B 14.79 3.455e-27
168-191 PR00318C 12.09 7.000e-
23 197-215 PR00318A 7.84
1.600e-19 35-51 PR00318E 7.23
2.5G0e-12 265-275


128 


PR00927 


ADENINE NUCLEOTIDE 
TRANSLOCATOR 1 SIGNATURE


PR00927E 14.93 9.743e-10 67-89
PR00927B 14.66 4.575e-09 69-91 


130 


BL00824 


Elongation factor 1 beta/beta'/delta chain
proteins. 


BL00824B 9.21 7.750e-22 133-
153


131 


BL00824 


Elongation factor 1 beta/beta'/delta chain
proteins. 


BL00824C 14.58 1.000e-40 166-
204 BL00824D 14.04 1.621e-38
204-239 BL00824B 9.21 7.750e-
22 133-153 BL00824E 12.49
1.000e-19 247-263


132 


PR00209 


ALPHA/BETA GLIADIN FAMILY
SIGNATURE 


PR00209B 4.88 9.222e-13 1209-
1228 


133 


PR00209 


ALPHA/BETA GLIADIN FAMILY 
SIGNATURE 


PR00209B 4.88 9.222e-13 1168-
1187


134 


PR00708 



ALPHA-1-ACID GLYCOPROTEIN
SIGNATURE 


PR00708D 14.67 1.000e-27 141-
168 PR00708C 11.77 1.643e-25
98-120 PR00708B 15.15 2.174e-
24 73-95 PR00708E 13.33
1.600e-21 189-207 PR00708A
14.40 2.636e-21 51-70


135 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 8.468e-13 126- 
145 


136 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 3.250e-10 201- 
217 


137 


BL00471 


Small cytokines (intercrine/chemokine) 
C-x-C subfamily signat. 


BL00471 23.92 7.480e-10 42-90 


140 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 5.582e-10 328- 
346 PR00205B 11.39 9.018e-10
543-561


141 


BL00412 


Neuromodulin (GAP-43) proteins. 


BL00412D 16.54 7.704e-09 976- 
1027 


143 


PR00979 


TAFAZZIN SIGNATURE 


PR00979E 10.83 5.950e-26 192- 
214 PR00979A 11.91 8.773e-25 
63-83 PR00979C 12.16 6.400e-19 
108-124 PR00979D 12.38 7.955e- 
19 170-185 PR00979F 10.14 
3.382e-15 230-244 PR00979B 
15.59 5.636e-15 94-106 


145 


DM00686 


kw REPLICATION REP 28K 17.7K.


DM00686C 14.14 7.720e-09 111- 
131 


146 


PR00604 


CLASS IA AND IB CYTOCHROME C
SIGNATURE 


PR00604D 15.86 1.000e-17 87-
104 PR00604B 12.73 9.591e-16
57-73 PR00604C 10.21 8.200e-12
73-84 PR00604E 10.13 1.000e-11
106-117 PR00604A 11.13 8.800e-
11 44-52 PR00604F 8.60 1.000e-
10 123-132


147 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 3.864e-15 266- 
297 BL00107B 13.31 6.143e-11
335-351 


148 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 


PD00289 9.97 8.448e-09 67-81


149 
150 


PR00069 
BL00027 


ALDO-KETO REDUCTASE 
SIGNATURE 

'Homeobox' domain proteins. 


PR00069D IQ 1 k^if^ -i/i icT 
217 PR00069A 16.01 7.429e-25 
41-66 PR00069E 18.14 3.100e-22 
235-260 PR00069C 16.03 7.000e- 
20 151-169 PR00069B 11.33 
8.071e-19 101-120 


151 


PD02906 


SYNTHASE I PSEUDOURIDYLATE 
PSEUDOURIDINE LYASE TR. 


BL00027 26.43 2.688e-27 139-182

PD02906C 24.17 7.070e-22 165-
200 PD02QO/TR 1 ^ » QO^a i « 

114-127 PD02906A 10.84 6.500e- 
09 71-84 


153 
158 


BL00479
BL00027 


Phorbol esters / diacylglycerol binding
domain proteins. 

'Homeobox' domain proteins. 


BL00479A 19.86 5.091e-12 891-
914 BL00479B 12.57 1.837e-11
915-931


160 




Granins proteins.


BL00027 26.43 6.786e-31 143-186 
BL00422C 16.18 7.750e-12 420- 
448 


162 


PR00625 


DNA J PROTEIN FAMILY
SIGNATURE


PR00625A 12.84 9.297e-11 62-82


164 


BL01282 


BIR repeat proteins. 


BL01282B 30.49 6.182e-10 347-
386


166 


PR00860 


VERTEBRATE METALLOTHIONEIN 
SIGNATURE 


PR00860B 7.04 2.929e-20 83-97
PR00860A 5.46 1.000e-18 61-74
PR00860C 9.61 1.900e-15 97-107


167 


PR00449 


TRANSFORMING PROTEIN P21 RAS 
SIGNATURE 


PR00449A 13.20 7.052e-09 196- 
218 


169 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins. 


BL00514C 17.41 1.346e-39 316-
353 BL00514G 15.98 2.241e-34
471-501 BL00514H 14.95 6.571e-
27 510-535 BL00514E 14.28
1.273e-16 388-405 BL00514D
15.35 9.100e-15 369-382
BL00514B 16.42 4.857e-14 260-
276 BL00514F 11.65 9.690e-14
416-431 BL00514A 11.68 8.200e-
11 149-159


170 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins. 


BL00514C 17.41 1.346e-39 268-
305 BL00514G 15.98 2.241e-34
423-453 BL00514H 14.95 6.571e-
27 462-487 BL00514E 14.28
1.273e-16 340-357 BL00514D
15.35 9.100e-15 321-334
BL00514B 16.42 4.857e-14 212-
228 BL00514F 11.65 9.690e-14
368-383 BL00514A 11.68 8.200e-
11 101-111


171 


BL00514


Fibrinogen beta and gamma chains C-
terminal domain proteins.


BL00514G 15.98 2.241e-34 385-
415 BL00514H 14.95 6.571e-27
424-449 BL00514C 17.41 4.632e-
M 230-267 BL00514E 14.28
1.273e-16 302-319 BL00514D
15.35 9.100e-15 283-296
BL00514B 16.42 4.857e-14 212-
228 BL00514F 11.65 9.690e-14
330-345 BL00514A 11.68 8.200e-
11 101-111




BL00027


'Homeobox' domain proteins.


BL00027 26.43 9.400e-29 119-162


174 


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL III.


DM01970B 8.60 5.119e-15 1391-
1404 


176 


BL00773 


Chitinases family 19 proteins.


BL00773C 9.42 8.000e-09 2-16


182


PR00109


TYROSINE KINASE CATALYTIC
DOMAIN SIGNATURE


PR00109B 12.27 9.163e-14 141-
160


183


PD01937


DNA PROTEIN POLYMERASE
ENDONUCLEASE DNA-.


PD01937A 6.68 3.475e-09 221- 
232 


1 0< 


BL00845


CAP-Gly domain proteins. 


BL00845 16.43 2.946e-23 247-272 
BL00845 16.43 1.628e-21 107-132 


186


PR00452


SH3 DOMAIN SIGNATURE


PR00452B 11.65 6.538e-11 525-
541


187


PR00452


SH3 DOMAIN SIGNATURE


PR00452B 11.65 6.538e-11 497-
513


188 


DM01803 


1 HERPESVIRUS GLYCOPROTEIN H. 


DM01803A 10.51 1.000e-09
1081-1102 


189 


PF00651 


BTB (also known as BR-C/Ttk) domain
proteins. 


PF00651 15.00 5.091e-15 69-82 


190 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194C 6.38 1.900e-35 145-
174 PR00194E 8.74 3.250e-30
231-257 PR00194D 9.57 1.500e-
26 175-199 PR00194B 10.24
5.200e-24 120-141 PR00194A
7.86 4.857e-21 84-102


192 


PD02042 


IRON-SULFUR ELECTRON 
TRANSPORT. AROMATIC 
HYDROCARB.


PD02042B 16.75 5.154e-09 131-
146 PD02042A 21.13 5.909e-09
94-121


193 


PR00021


SMALL PROLINE-RICH PROTEIN 
SIGNATURE 


PR00021A 4.31 2.200e-10 2-15


195


BL00463 


Fungal Zn(2)-Cys(6) binuclear cluster
domain proteins. 


BL00463 8.22 5.071e-09 111-123 


196 


PR00118


BETA-LACTAMASE CLASS A 
SIGNATURE 


PR00118F 16.42 9.386e-09 165- 
181 


197 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 5.424e-09 234-
267


198 


BL00660 


Band 4.1 family domain proteins.


BL00660A 31.50 5.500e-11 714-
767 


199 


BL00282 


Kazal serine protease inhibitors family 
proteins. 


BL00282 16.88 8.820e-13 70-93 


202 


PR00009


TYPE I EGF SIGNATURE 


PR00009A 14.15 5.345e-15 971-
987 PR00009C 14.11 8.773e-13
996-1008 PR00009D 16.83
8.000e-11 1008-1018 PR00009C
14.11 1.882e-09 892-904


203 


BL00025 


P-type 'Trefoil' domain proteins. 


BL00025 17.17 4.536e-19 38-59 


2uD 


BL00018


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 7.300e-10 165-178 


206 


PR00168


SLOW VOLTAGE-GATED 
POTASSIUM CHANNEL SIGNATURE 


PR00168D 12.88 6.865e-11 67-86


207 




P-type 'Trefoil' domain proteins.


BL00025 17.17 5A13^\L^ 39-60
BL00025 17.17 8.750e-16 88-109 


209 


BL00646 


Ribosomal protein S13 proteins.


BL00646B 21.42 6.100e-30 110-
143 BL00646A 25.82 6.192e-29
14-62


210 


PR00138 


MATRIXIN SIGNATURE


PR00138D 16.56 3.605e-25 279-
305 PR00138C 16.41 3.000e-24
218-247 PR00138E 6.01 8.714e-
13 314-328 PR00138A 15.14
9.538e-13 134-148 PR00138B
15.82 4.522e-12 188-204


211 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 8.429e-12 386-
406 DM01206B 10.69 1.247e-10 
384-404 DM01206B 10.69 
5.068e-10 388-408 


212 
213 


PD01941 

BL00362 


TRANSMEMBRANE 
COTRANSPORTER SYMP. 

Ribosomal protein S15 proteins.


PD01941A 14.81 1.000e-40 163-
217 PD01941B 15.02 9.705e-30 
420-467 Pr)ni041P 1^ 09 Q Ti/i^ 

23 837-884 PD01941C 19.96 
8.200e-20 508-563 PD01941D 
27.18 1.600e-16 661-710 
PD01941F 28.52 9.645e-15 1005- 
1060 


214 


BL00115 


Eukaryotic RNA polymerase II
heptapeptide repeat proteins. 


BL00362 24.67 8.313e-09 330-373 
BL00115Z 3.12 2.125e-09 1178-
1227 BL00115Z 3.12 6.096e-09
1164-1213 


215 


BL00038


Myc-type, 'helix-loop-helix' dimerization
domain proteins. 


BL00038B 16.97 7.600e-18 125- 
146 BL00038A 13.61 1.474e-13
102-118 


216 


BL01108 


Ribosomal protein L24 proteins.


BL01108A 20.33 2.241e-22 49-82
BL01108B 11.40 8.457e-10 96-
107


217 


PR00381 


KINESIN LIGHT CHAIN SIGNATURE


PR00381A 9.55 1.321e-10 360-
378


222 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins.


BL00514C 17.41 2.358e-26 1166-
1203 BL00514G 15.98 9.000e-15
1289-1319 BL00514D 15.35
6.936e-12 1207-1220 BL00514F
11.65 4.288e-10 1253-1268
BL00514H 14.95 8.636e-10 1318-
1343


223 


BL00325 


Actin-depolymerizing proteins. 


BL00325B 21.66 1.000e-40 93-
139 BL00325A 24.83 9.333e-24
61-93


224 

225 


BL00018 
PF01329 


EF-hand calcium-binding domain
proteins.

Pterin 4 alpha carbinolamine dehydratase.


BL00018 7.41 1.450e-10 231-244 


228 
230


BL00211 


ABC transporters family proteins.


PF01329B 18.52 1.692e-18 67-92 
BL00211B 13.37 6.250e-18 1033- 
1065 BL00211B 13.37 8.875e-18 
2045-2077 BL00211A 12.23 
1.900e-09 931-943




PR00761 


BINDIN PRECURSOR SIGNATURE 


PR00761A5.81 9.366e-09 275- 
292 


231 


PR00049 


WILM'S TUMOUR PROTEIN 


PR00049D 0.00 3.500e-10 54-69


232 


BL00412 


Neuromodulin (GAP-43) proteins.


BL00412D 16.54 1.978e-10 109- 
160 BL00412D 16.54 4.122e-09
133-184 


233 


BL01210


Caveolins proteins.


BL01210B 13 92 8 12Qe-00 IHA 
156 


236 
238


BL00939


Ribosomal protein L1e proteins.


BL00939F 17.27 5.393e-09 861- 

591 




BL01252


Endogenous opioids neuropeptides
precursors proteins.


BL01252D 18.25 3.571e-28 205-
233 BL01252B 19.09 5.034e-27
37-67 BL01252C 18.10 1.621e-21
164-190 BL01252A 14.22 7.107e-
18 14-34


239 


BL00302 


Eukaryotic initiation factor 5A hypusine 
proteins. 


BL00302 14.81 l.OOOe-40 25-79 


240 


PR00420 


AROMATIC-RING HYDROXYLASE
(FLAVOPROTEIN
MONOOXYGENASE) SIGNATURE


PR00420A 14.78 8.851e-13 26-49 


241 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PD02929A 28.27 4.529e-09 235- 
289 


243 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 8.527e-25 11-50 


244 


BL01270 


Band 7 protein family proteins. 


BL01270C 16.91 6.745e-17 115- 
144 BL01270B 18.74 6.857e-17 
76-115 BL01270E 13.03 6.016e- 
15 182-211 BL01270D 20.87 
9.160e-13 144-182 


245 


PF00791 


Domain present in ZO-1 and Unc5-like 
netrin receptors. 


PF00791B 28.49 6.305e-12 253- 
308 PF00791B 28.49 1.909e-ll 
427-482 PF00791B 28.49 2.651e-
09 179-234 PF00791B 28.49 
3.890e-09 112-167 


246 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 2.500e-13 277-290
PD00066 13.92 9.143e-12 193-206
PD00066 13.92 5.304e-11 165-178
PD00066 13.92 6.478e-11 249-262
PD00066 13.92 3.423e-10 221-234


247 


BL00406 


Actins proteins. 



BL00406D 12.58 6.400e-20 465-
520 BL00406B 5.47 4.857e-14
249-304 BL00406E 8.44 1.000e-
11 522-572 BL00406C 6.75
5.449e-11 313-368


248 


BL00951 


ER lumen protein retaining receptor 
proteins. 


BL00951C 19.35 1.000e-40 112-
161 BL00951A 15.10 7.750e-39
21-57 BL00951D 13.94 6.000e-38
161-196 BL00951B 14.23 3.100e-
31 57-88


252 


BL01113 


C1q domain proteins.


BL01113A 17.99 9.129e-15 200-
227 BL01113A 17.99 4.818e-14
194-221 BL01113A 17.99 7.818e-
14 182-209 BL01113A 17.99
1.730e-13 185-212 BL01113A
17.99 6.595e-13 191-218
BL01113A 17.99 6.077e-12 203-
230 BL01113A 17.99 9.182e-11
179-206 BL01113A 17.99 2.532e-
10 176-203 BL01113A 17.99
9.043e-10 218-245 BL01113A
17.99 9.426e-10 209-236
BL01113A 17.99 4.115e-09 137-
164


257 


BL00845 


CAP-Gly domain proteins.


BL00845 16.43 1.837e-21 466-491 


259 


PR00248


METABOTROPIC GLUTAMATE 
GPCR SIGNATURE


PR00248G 12.67 2.688e-09 53-78 


260 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 3.400e-10 441-452
BL00678 9.67 5.800e-10 481-492
BL00678 9.67 8.800e-10 358-369


261 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 3.400e-10 415-426
BL00678 9.67 5.800e-10 455-466


262


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 8.800e-10 332-343
BL00678 9.67 3.400e-10 468-479
BL00678 9.67 5.800e-10 508-519
BL00678 9.67 8.800e-10 385-396


263 


BL50002 


Src homology 3 (SH3) domain proteins
profile.


BL50002B 13.Jif 2.200e-10 415-
429 


264 


BL00049 


Ribosomal protein L14 proteins. 


BL00049C 17.38 3.040e-12 94-
130


265 


PD01469 


GLYCOPROTEIN PROTEIN
PRECURSOR SA. 


PD01469 20.69 2.091e-14 438-470 


266 

267 


PD01469 

BL00567 


GLYCOPROTEIN PROTEIN
PRECURSOR SA. 
Phosphoribulokinase proteins. 


PD01469 20.59 2.091e-14 279-311


269 
272 


BL00049 
BL01115 


Ribosomal protein L14 proteins. 
GTP-binding nuclear protein ran proteins 


BL00567A 10.66 1.161e-12 36-55 
BL00049C 17.38 2.688e-28 92- 
128 BL00049B 18.42 6.806e-24 
54-86 BL00049A 13.86 8.333e-19 
19-42 BL00049D 13.47 5.765e-12 
129-140 


273 


PR00021 


SMALL PROLINE-RICH PROTEIN
SIGNATURE


BL01115A 10.22 9.735e-12 14-58 
PR00021A 4.31 1.911e-09 819-
832 


275 


PR00179 


LIPOCALIN SIGNATURE 


PR00179B 9.56 2.895e-13 124-
137 PR00179A 13.78 3.250e-ll 
36-49 PR00179C 19.02 6.040e-ll 
154-170 


276 


PR00449 


TRANSFORMING PROTEIN P21 RAS
SIGNATURE 


PR00449A 13.20 8.364e-17 22-44 
PR00449C 17.27 1.000e-13 62-85
PR00449E 13.50 4.000e-12 172- 
195 PR00449B 14.34 5.680e-10 
45-62 




BL00140


Ubiquitin carboxyl-terminal hydrolase
family 1 cysteine activ.


BL00140D 22.64 1.000e-40 161-
205 BL00140C 11.80 9.053e-30
79-104 BL00140A 15.96 9.400e-
28 5-35 BL00140B 12.29 4.649e-
17 37-55


279 


PD02712 
BL00678 


ELEMENT TRANSPOSASE FOR 
TRANSPOSON TRANSPOSABLE. 


PD02712A 23.03 8.013e-09 47-83 " 


282 
283 


DM00892 
BL00048 


Trp-Asp (WD) repeat proteins proteins.
3 RETROVIRAL PROTEINASE.

Protamine P1 proteins.


BL00678 9.67 1.474e-09 100-111 
DM00892C 23.55 4.767e-21 864- 
898 


286 


PR00081 


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY
SIGNATURE


BL00048 6.39 9.550e-09 56-83 
PR00081A 10.53 1.878e-ll 36-54 


287 


PR00310 


ANTI-PROLIFERATIVE PROTEIN
BTG1 FAMILY SIGNATURE


PR00310B 10.59 4.231e-17 29-59
PR00310D 9.10 6.679e-16 89-119 


289 


PD01066 


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU. 


PD01066 19.43 7.000e-36 37-76 


293 


BL00979 


G-protein coupled receptors family 3 
proteins.


BL00979L 20.63 3.800e-12 111-
152


295 
296


PD02411


PROTEIN TRANSCRIPTION
REGULATION NUCLEAR. 


PD02411 21.89 7.000e-16 195-229




BL01064


Pyridoxamine 5'-phosphate oxidase
proteins.


BL01064A 27.84 8.313e-28 77-
129 BL01064C 15.22 7.136e-25
202-235


297 


BL00030


Eukaryotic RNA-binding region RNP-1
proteins.


BL00030A 14.39 2.929e-13 37-56
BL00030B 7.03 1.900e-11 167-
177 BL00030A 14.39 2.000e-10
128-147


298


BL01183 


ubiE/COQ5 methyltransferase family
proteins. 


BL01183B 21.31 6.660e-12 143- 
188 


299 


BL01279 


Protein-L-isoaspartate(D-aspartate) O-
methyltransferase signa. 


BL01279A 24.27 5.862e-11 57-
105 


301 


BL00191 


Cytochrome b5 family, heme-binding
domain proteins. 


BL00191K 17.38 4.951e-27 184-
228 BL00191J 11.37 6.447e-17 
128-150 


302 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 3.893e-16 33-67 


306 


PF01140 


Matrix protein (MA), p15.


PF01140D 15.54 2.988e-09 416- 
451 


307 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 4.818e-21 59-81
PR00245C 7.84 5.154e-20 238- 
254 PR00245D 10.47 4.000e-15 
274-286 PR00245B 10.38 8.200e- 
15 177-192 PR00245E 12.40 
5.714e-12 291-306 


309 


BL00203 


Vertebrate metallothioneins proteins.


BL00203 13.94 2.245e-10 612-658 


310 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 7.632e-23 119- 
159 BL00237C 13.19 3.864e-15 
251-278 BL00237D 11.23 3.739e- 
12 312-329 


311 


BL00380 


Rhodanese proteins. 


BL00380D 15.90 8.200e-28 110- 
136 BL00380G 11.26 5.800e-16 
267-280 BL00380B 14.77 7.000e- 
14 49-62 BL00380F 9.76 5.886e- 
13 203-214 BL00380C 15.67 
7.387e-13 82-98 BL00380E 12.44
7.000e-11 181-193 BL00380A
10.48 1.000e-09 10-20


312 


BL00227 


Tubulin subunits alpha, beta, and gamma 
proteins.


BL00227B 19.29 1.000e-40 50-
105 BL00227C 25.48 1.000e-40
111-163 BL00227D 18.46 1.000e-
40 220-274 BL00227F 21.16
1.000e-40 372-426 BL00227A
24.55 3.250e-39 1-35 BL00227E
24.15 8.500e-34 324-359


327 


BL00232 


Cadherins extracellular repeat proteins 
domain proteins. 


BL00232B 32.79 7.362e-21 225- 
273 BL00232B 32.79 2.588e-17 
435-483 BL00232B 32.79 6.301e- 
15 116-164 BL00232B 32.79 
6.769e-13 330-378 BL00232C 
10.65 9.341e-12 223-241 
BL00232C 10.65 5.696e-ll 328- 
346 BL00232C 10.65 3.942e-10
433-451 


329 


PD02749 


TRANSCRIPTION PROTEIN FACTOR 
BTF3 REGULATION NUCL. 


PD02749B 12.75 2.241e-37 35-71
PD02749C 13.96 4.892e-28 87-
121 PD02749A 9.56 6.000e-15 2-
15


330 


PR00391 


PHOSPHATIDYLINOSITOL 
TRANSFER PROTEIN SIGNATURE 


PR00391E 12.50 7.785e-15 211-
231 PR00391B 8.39 1.000e-13
83-104 PR00391D 12.21 9.328e-
13 191-207 PR00391A 7.83
5.390e-11 16-36


332 


BL01030 


RNA polymerases M / 15 Kd subunits 
proteins. 


BL01030 23.44 1.818e-23 87-125 


337 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.929e-32 6-45 


340 


PD02711 


SYNTHASE 


PD02711B 14.26 1.973e-20 944-



PHOSPHORIBOSYLFORMYLGLY.






968



346 



Annexins repeat proteins domain 
proteins. 



PR00345 



STATHMIN FAMILY SIGNATURE 



BL00223C 24.79 1.000e-40 245-
300 BL00223B 28.47 8.714e-38
168-218 BL00223A 15.59 8.250e-
27 98-132 BL00223A 15.59
8.750e-27 26-60 BL00223C 24.79
9.438e-16 13-68 BL00223C 24.79
2.735e-15 85-140 BL00223A
15.59 2.253e-11 258-292



PR00345B 7.12 2.800e-28 81-110
PR00345E 8.54 7.652e-28 158-
183 PR00345C 4.54 9.100e-28
110-134 PR00345D 10.97 1.964e-
24 134-158 PR00345A 13.46
5.645e-16 52-71



348 



351 



354 



358 



Ribosomal protein L16 proteins. 



PR00388 



BL00018 



3',5'-CYCLIC NUCLEOTIDE CLASS H 
PHOSPHODIESTERASE SIGNATURE 



BL00678 



EF-hand calcium-binding domain 
proteins. 



DM01206 



Trp-Asp (WD) repeat proteins proteins. 



CORONAVIRUS NUCLEOCAPSID
PROTEIN. 



BL00586B 17.00 3.215e-15 184-
221 



PR00388A 10.45 2.778e-09 86-
105 



BL00018 7.41 3.118e-11 160-173
BL00018 7.41 2.350e-10 244-257 



BL00678 9.67 1.947e-09 256-267 



DM01206B 10.69 3.278e-09 175-
195 DM01206B 10.69 6.696e-09 
183-203 DM01206B 10.69 
8.633e-09 132-152 DM01206B 
10.69 8.861e-09 181-201 
DM01206B 10.69 9.316e-09 177- 
197 



362 



365 



366 



PD01498 



OXIDASE BIOSYNTHESIS 
OXIDOREDUCTASE PORP.



BL00178 



OXIDASE BIOSYNTHESIS
OXIDOREDUCTASE PORP. 



PD01498C 24.90 6.880e-14 219-

263



Aminoacyl-transfer RNA synthetases 
class-I proteins. 



PD01498C 24.90 6.880e-14 219-

263 



BL00523 



Sulfatases proteins. 



BL00178B 7.11 1.000e-11 589-
600 BL00178A 14.23 8.500e-09 
46-56 



BL00523E 19.27 1.000e-23 318-
348 BL00523A 13.36 5.500e-16
30-47 BL00523B 8.64 1.964e-13
78-90 BL00523C 12.64 9.625e-13
129-140 BL00523G 9.46 5.500e-
10 506-516 



370 



371 



372 



373 



375 



377 



BL00880 



BL00107 



PR00211 



BL00279 



PD01066 



PD01066 
BL00598 



Protein kinases ATP-binding region 
proteins. 



BL00107A 18.39 4.818e-09 21-52



Acyl-CoA-binding protein. 



Protein kinases ATP-binding region
proteins. 



BL00880 17.52 1.000e-40 75-125



GLUTELIN SIGNATURE



BL00107A 18.39 1.000e-23 276-
307 BL00107B 13.31 1.692e-12 
342-358 



Membrane attack complex components / 
perforin proteins. 



PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 



PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.
Chromo domain proteins.



PR00211B 0.86 6.602e-11 326-
347 PR00211B 0.86 6.106e-10
320-341 PR00211B 0.86 3.167e-
09 333-354



BL00279E 37.11 9.349e-10 749-
797 



PD01066 19.43 1.231e-33 10-49 



PD01066 19.43 7.563e-28 10-49
BL00598 14.45 5.781e-.163-2r


380 


PR00413 


HALOACID 

DEHALOGENASE/EPOXIDE 
HYDROLASE FAMILY SIGNATURE 


PR00413D 11.28 8.941e-09 864-
878 


383 


PR00413 


HALOACID 

DEHALOGENASE/EPOXIDE 
HYDROLASE FAMILY SIGNATURE 


PR00413D 11.28 8.941e-09 864-
878 


387 


BL01060 


Flagella transport protein fliP family 
proteins. 


BL01060A 15.65 1.535e-09 131-

174 


388 


PR00209 


ALPHA/BETA GLIADIN FAMILY 
SIGNATURE 


PR00209B 4.88 6.318e-11 1009-
1028 


389 


PR00837 


ALLERGEN V5/TPX-1 FAMILY 
SIGNATURE 


PR00837B 11.64 1.000e-10 469-
483 


391 


BL00240 


Receptor tyrosine kinase class III 
proteins. 


BL00240B 24.70 7.907e-10 118-
142 


392 


PR00014 


FIBRONECTIN TYPE III REPEAT
SIGNATURE 


PR00014D 12.04 8.412e-10 691- 
706 


393 


PR00014 


FIBRONECTIN TYPE III REPEAT

SIGNATURE 


PR00014D 12.04 8.412e-10 706- 
721 


394 


BL01209 


LDL-receptor class A (LDLRA) domain 
proteins. 


BL01209 9.31 3.368e-15 47-60
BL01209 9.31 5.500e-13 92-105 


395 


BL00634 


Ribosomal protein L30 proteins. 


BL00634 34.38 4.090e-13 70-121 


396 


BL01013 


Oxysterol-binding protein family
proteins. 


BL01013D 26.81 8.000e-26 358- 
402 BL01013A 25.14 7.231e-21
45-81 BL01013C 9.97 1.000e-13
132-142 BL01013B 11.33 1.000e-
11 110-121 


397 


BL00930 


Peripherin / rom-1 proteins. 


BL00930E 17.80 1.000e-40 56-92
BL00930D 9.12 4.632e-37 12-56
BL00930F 16.91 2.800e-36 92-
133


400


PR00780


LEUSERPIN 2 SIGNATURE


PR00780B 4.89 4.491e-09 262-
285 


401 


PR00819 


CBXX/CFQX SUPERFAMILY
SIGNATURE 


PR00819B 10.83 7.158e-11 4-20


403 


BL00381


Endopeptidase Clp serine proteins. 


BL00381C 23.84 1.250e-32 150- 
194 BL00381A 16.48 2.286e-22 
74-111 BL00381B 21.42 8.326e- 
14 78-130 


405 


BL01105 


Ribosomal protein L35Ae proteins. 


BL01105A 17.37 1.000e-40 4-49
BL01105B 12.95 1.000e-40 68-
108 


406 


BL00344 


GATA-type zinc finger domain proteins.


BL00344 17.99 7.000e-12 814-852


407 


PR00211 


GLUTELIN SIGNATURE 


PR00211B 0.86 9.750e-09 73-94 


409 


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE 


PR00910A 2.51 4.321e-09 9-22 


410 


BL00762 


WHEP-TRS domain proteins. 


BL00762A 23.43 1.000e-28 752-
789 BL00762A 23.43 4.400e-21 
903-940 BL00762A 23.43 5.415e- 
18 825-862 BL00762B 16.14 
8.759e-12 1154-1168 


412 


BL00690 


DEAH-box subfamily ATP-dependent 
helicases proteins. 


BL00690B 13.38 5.320e-15 262- 
280 BL00690A 6.87 1.818e-13
230-240 


415 


BL00227 


Tubulin subunits alpha, beta, and gamma
proteins. 


BL00227B 19.29 1.000e-40 52-
107 BL00227C 25.48 1.000e-40
113-165 BL00227D 18.46 1.000e-
40 222-276 BL00227F 21.16
1.000e-40 382-436 BL00227E
24.15 1.750e-34 326-361


416 

418 


PF00992 
BL00541 


Troponin. 

Nuclear transition protein 1 proteins. 


BL00227A 24.55 1.000e-33 1-35

PF00992A 16.67 1.711e-09 557- 
592 

BL00541 8.44 9.875e-09 256-310


419 
420 

421 


BL00541 
PF00856


Nuclear transition protein 1 proteins. 
SET domain proteins.

Trp-Asp (WD) repeat proteins proteins.


BL00541 8.44 9.875e-09 197-251 
PF00856A 26.14 9.074e-13 901- 
938 PF00856B 16.42 2.397e-12
951-973 


423 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


BL00678 9.67 8.200e-12 33-44 
PD01066 19.43 8.600e-30 130-169 


424 
426 


PF00564 
PR00988 


Octicosapeptide repeat proteins. 

URIDINE KINASE SIGNATURE 


PF00564B 24.74 1.305e-17 421- 

472 

PR00988A 6.39 4.569e-12 3-21 


427

428 


PR00988

BL00478 


URIDINE KINASE SIGNATURE 
LIM domain proteins. 


PR00988A 6.39 4.569e-12 3-21 
BL00478B 14.79 3.250e-13 115- 
130 BL00478B 14.79 9.036e-13 
50-65 


All 
AlO 


BL00282 


Kazal serine protease inhibitors family
proteins. 


BL00282 16.88 8.875e-12 464-487 




PD00930


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 7.800e-18 316- 
357 PD00930A 25.62 9.617e-12 
125-151 PD00930B 33.72 2.521e-
10 214-255 




PD01066


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 4.649e-34 34-73 


434 


PR00449 


TRANSFORMING PROTEIN P21 RAS 

SIGNATURE 


PR00449A 13.20 7.563e-11 56-78


436 


PR00120 


H+-TRANSPORTING ATPASE 
(PROTON PUMP) SIGNATURE 


PR00120C 9.90 5.800e-19 705- 

722 


437
438 


BL00115 
PF00628 


Eukaryotic RNA polymerase II
heptapeptide repeat proteins. 

PHD-finger. 


BL00115T 8.45 7.273e-29 1208-
1242 BL00115Q 18.08 2.776e-21
953-983 BL00115Y 11.86 S.OOOe-
17 1604-1650 BL00115M 19.19
8.130e-16 731-774 BL00115H
14.34 9.392e-16 463-496
BL00115A 15.44 7.414e-15 43-82
BL00115R 6.50 6.128e-14 983-
1010 BL00115J 16.71 9.289e-14
591-617 BL00115I 8.33 4.336e-
13 535-590 BL00115L 12.25
5.939e-13 662-694 BL00115G
11.65 6.011e-13 435-463
Djuuui I jis. 1 j.oi j.4l7e-10 617-
659 BL00115O 16.76 5.805e-10
863-913 BL00115P 11.54 7.538e-
10 913-953 BL00115S 18.24
7.968e-10 1010-1052 BL00115U
10.34 4.475e-09 1242-1265


440 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PF00628 15.84 4.536e-10 219-234
PD01066 19.43 6.351e-34 10-49


441
442


PR00309
BL00600


ARRESTIN SIGNATURE 
Aminotransferases class-III pyridoxal-


PR00309A 9.68 5.250e-24 32-55 
PR00309D 7.09 4.938e-23 290-
309 PR00309B 7.81 2.800e-21
59-88 PR00309C 8.22 1.621e-19
165-183 PR00309E 9.82 9.438e- 
15 374-389 

BL00600B 19.60 7.324e-14 103-






phosphate attachment si. 


129 BL00600G 12.43 2.125e-12 
306-325 BL00600F 8.77 8.105e- 
12 271-284 BL00600E 16.43 
3.167e-11 228-257 BL00600D
8.71 8.650e-09 207-221 


443 


BL00972 


Ubiquitin carboxyl-terminal hydrolases 
family 2 proteins. 


BL00972A 11.93 3.160e-18 69-87


444 


BL00349 


CTF/NF-I proteins. 


BL00349A 10.07 1.000e-40 8-54
BL00349C 9.33 1.000e-40 82-125
BL00349E 10.79 1.000e-40 152-
195 BL00349F 11.81 1.000e-40
213-255 BL00349H 15.70 7.387e-
36 361-399 BL00349B 10.51
2.227e-34 54-82 BL00349D 11.70
9.100e-34 125-152 BL00349G
19.72 5.781e-30 323-356


445 


BL00154 


E1-E2 ATPases phosphorylation site 
proteins. 


BL00154F 8.23 8.941e-21 271- 
295 BL00154E 20.37 2.620e-15 
124-165 


448 


DM00215


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 4.882e-11 82-115
DM00215 19.43 6.492e-09 87-120


451 


BL01283 


T-box domain proteins. 


BL01283A 24.15 3.100e-40 112-
160 BL01283D 11.70 6.000e-39
253-286 BL01283B 23.17 6.538e-
38 170-212 BL01283C 13.05
7.750e-19 222-236


452 


PR00420 


AROMATIC-RING HYDROXYLASE 
(FLAVOPROTEIN 
MONOOXYGENASE) SIGNATURE 


PR00420A 14.78 2.579e-11 3-26


453 


PR00162


RIESKE 2FE-2S SUBUNIT
SIGNATURE


PR00162B 12.77 7.429e-17 215-
228 PR00162A:?:3'5 2.324e-14
193-205 PR00162C 8.10 7.120e-
14 227-240


454 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.000e-30 87-126 


456 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 9.333e-18 1149-
1192 


457 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.737e-24 16-55 


459 


BL00290 


Immunoglobulins and major 
histocompatibility complex proteins. 


BL00290A 20.89 1.529e-14 154-
177 BL00290B 13.17 9.000e-12
214-232 


460 


PR00413 


HALOACID 

DEHALOGENASE/EPOXIDE 
HYDROLASE FAMILY SIGNATURE 


PR00413F 14.91 7.333e-ll 193- 
214 PR00413E 15.78 5.714e-09
175-192 


463 


PR00759 


BASIC PROTEASE (KUNITZ-TYPE) 
INHIBITOR FAMILY SIGNATURE 


PR00759B 11.26 8.385e-09 74-85


466 


BL00019 


Actinin-type actin-binding domain 
proteins. 


BL00019D 15.33 4.200e-19 300-
330 


467 


BL00019 


Actinin-type actin-binding domain
proteins. 


BL00019D 15.33 4.200e-19 300-
330 


469 


PR00153 


CYCLOPHILIN PEPTIDYL-PROLYL
CIS-TRANS ISOMERASE 
SIGNATURE 


PR00153D 11.99 3.250e-15 510- 
523 PR00153C 11.01 4.682e-14
495-511 PR00153E 9.10 8.548e-
14 523-539 PR00153B 11.57
1.720e-13 452-465


470 


BL00491 


Aminopeptidase P and proline 
dipeptidase proteins. 


BL00491C 12.15 3.912e-09 557-
572 


471 


PD00289 


PROTEIN SH3 DOMAIN REPEAT


PD00289 9.97 1.000e-14 1482-



474 



475 



476 



477 



PRESYNA. 



BL50040 



Elongation factor 1 gamma chain profile.



1496 PD00289 9.97 8.650e-11
1122-1136 



BL01144



PR00007 



BL50002 



Ribosomal protein L31e proteins.
COMPLEMENT C1Q DOMAIN
SIGNATURE 



Src homology 3 (SH3) domain proteins 
profile. 



BL50040D 17.41 1.000e-40 279-
329 BL50040E 18.79 1.000e-40
333-388 BL50040F 18.99 5.320e- 
40 390-428 BL50040C 22.62 
3.739e-38 141-184 BL50040B 
13.65 7.000e-30 59-85 BL50040A 
12.98 1.450e-14 10-22 



BL01144 25.07 1.000e-40 22-



PR00007C 15.60 2.421e-21 589-
611 PR00007B 14.16 3.500e-21 
544-564 PR00007A 19.33 6.897e- 
20 517-544 PR00007D9.64 
6.571e-12 623-634 



BL50002A 14.19 5.846e-10 170- 
189 



480 



481 



482 



DM01970 



PR00868 



0 kw ZK632.12 YDR313C
ENDOSOMAL 10. 



DNA-POLYMERASE FAMILY A (POL
I) SIGNATURE



DM01970B 8.60 9.500e-17 967- 
980 



BL00027 



BL00061 



'Homeobox' domain proteins. 



PR00868C 13.76 5.688e-17 284-
308 PR00868A 16.33 3.186e-13
224-247 PR00868H 12.51 3.388e-
13 431-448 PR00868I 10.87
7.938e-11 462-476 PR00868E
13.19 1.608e-10 340-366 



Short-chain dehydrogenases/reductases 
family proteins. 



BL00027 26.43 9.182e-22 53-96 



Src homology 3 (SH3) domain proteins 
profile. 



BL00061B 25.79 3.647e-21 188- 
226 



485



BL50002 



PF00023



Ank repeat proteins. 



BL50002A 14.19 1.750e-12 1032- 
1051 



PF00023A 16.03 9.625e-10 760-
776 PF00023A 16.03 3.571e-09 
715-731 



487 



489 



PD02870 



RECEPTOR INTERLEUKIN-1 
PRECURSOR. 



PR00370 



FLAVIN-CONTAINING 
MONOOXYGENASE (FMO) 
SIGNATURE 



PD01675 



GLYCOPROTEIN MAJOR ENVELOPE
PROBABLE U3. 



PD02870B 18.83 9.262e-20 103-
136 PD02870D 15.74 9.426e-09 
201-236 



PR00370G 10.45 3.769e-28 471- 
493 PR00370B 10.91 1.000e-24
27-46 PR00370C 12.72 4.000e-21
140-157 PR00370E 11.96 9.229e-
21 320-339 PR00370D 16.33
1.750e-20 185-204 PR00370F
17.75 7.395e-20 375-395
PR00370A 3.35 2.038e-18 4-20



PD01675C 19.89 2.330e-10 55-89 



493 
494 



BL00211 



BL00211 
BL00211 



ABC transporters family proteins. 



ABC transporters family proteins. 



BL00211A 12.23 5.050e-09 45-57 



BL00211A 12.23 5.050e-09 45-57 



495 



BL00027 



ABC transporters family proteins. 



'Homeobox' domain proteins. 



BL00211A 12.23 5.050e-09 58-70 



BL00027 26.43 6.786e-12 509-552 
BL00027 26.43 9.143e-12 319-362
BL00027 26.43 2.600e-11 627-670
BL00027 26.43 3.625e-10 779-822 



499 



BL00107 



Protein kinases ATP-binding region 
proteins. 



BL00383 



Tyrosine specific protein phosphatases 



BL00107A 18.39 5.800e-22 214- 
245 BL00107B 13.31 1.000e-13
281-297 BL00107A 18.39 3.520e- 
13 583-614 BL00107B 13.31 
8.615e-12 652-668 



BL00383E 10.35 1.000e-14 1902-






proteins. 


1913 BL00383D 11.92 3.077e-14
1862-1875 BL00383A 13.34 
5.500e-14 1730-1745 BL00383C 
10.10 2.000e-13 1785-1796 
BL00383F 15.51 9.069e-12 1940- 
1956 BL00383B 7.61 1.692e-11
1755-1764 


501 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 1.360e-09 136-
150 PR00019A 11.19 1.667e-09
91-105 PR00019B 11.36 4.600e-
09 160-174 


503 


BL00226 


Intermediate filaments proteins. 


BL00226D 19.10 1.000e-40 367-
414 BL00226B 23.86 6.143e-27
195-243 BL00226A 12.77 7.840e- 
14 96-111 BL00226C 13.23 
2.600e-13 309-340 BL00226C 
13.23 6.143e-12 266-297 
BL00226B 23.86 1.209e-09 146- 
194 


505 


PD02407 


3-BISPHOSPHOGLYCERATE- 
INDEPENDENT PHOSPHOGLYCER. 


PD02407F 7.61 6.739e-09 916-
930 


506 


PF00632 


HECT-domain (ubiquitin-transferase). 


PF00632C 20.66 9.830e-19 991- 
1023 PF00632B 18.45 1.155e-11
940-968 


507 


BL01082 


Ribosomal protein L7Ae proteins. 


BL01082 20.37 4^73e-20 76-116


508 


BL00678 


Trp-Asp (WD) repeat proteins proteins.


BL00678 9.67 2.421e-09 493-504 


509 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 2.421e-09 473-484 


510 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320B 12.19 4.774e-11 567-
582 PR00320B 12.19 5.886e-10
763-778 PR00320C 13.01 6.760e-
10 567-582 PR00320A 16.74
7.618e-10 846-861 PR00320A
16.74 3.415e-09 763-778
PR00320A 16.74 6.268e-09 567-
582 


511 


BL00479 


Phorbol esters / diacylglycerol binding 
domain proteins. 


BL00479C 12.01 3.250e-12 170- 
183 


512 


BL50058 


G-protein gamma subunit profile.


BL50058 27.23 7.494e-09 10-58 


513 


BL00524 


Somatomedin B domain proteins. 


BL00524A 9.65 8.925e-14 80-101 


515 


BL00041 


Bacterial regulatory proteins, araC family 
proteins. 


BL00041 23.99 1.964e-19 492-524


516 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 8.500e-13 391-404 


517 


BL00415 


Synapsins proteins. 


BL00415E 4.82 9.291e-09 959- 
996 


518 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 9.471e-12 126- 
145 


519 


BL00290 


Immunoglobulins and major
histocompatibility complex proteins. 


BL00290B 13.17 4.750e-09 47-65 


522 


PR00505 


D12 CLASS N6 ADENINE-SPECIFIC
DNA METHYLTRANSFERASE 
SIGNATURE 


PR00505A 14.15 7.128e-09 364-
381 


525 


BL00312 


Glycophorin A proteins. 


BL00312B 9.22 5.781e-10 891- 

920 


528 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.500e-32 16-55 


529 


PR00254 


NICOTINIC ACETYLCHOLINE 
RECEPTOR SIGNATURE 


PR00254D 15.50 4.000e-17 131- 
150 PR00254A 11.23 4.706e-14 
61-78 PR00254C 11.36 4.000e-12








113-126 PR00254B 12.97 1.486e-
11 95-110


531 


BL00741 


Guanine-nucleotide dissociation 
stimulators CDC24 family sign. 


BL00741B 14.27 6.870e-16 787-
810 


532 


PR00193


MYOSIN HEAVY CHAIN
SIGNATURE 


PR00193D 14.36 3.143e-34 447- 
476 PR00193C 12.60 7.632e-32 
216-244 PR00193B 11.69 7.750e- 
29 167-193 PR00193A 15.41 
2.588e-22 111-131 PR00193E 
19.47 2.200e-21 501-530 


533 


PD02870 


RECEPTOR INTERLEUKIN-1

PRECURSOR.


PD02870B 18.83 5.596e-09 348- 
381 


535 


PR00683 


SPECTRIN PLECKSTRIN
HOMOLOGY DOMAIN SIGNATURE 


PR00683D 15.87 2.452e-10 465-
484 


536 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 6.684e-24 164-207 


538 


PR00239 


MOLLUSCAN RHODOPSIN C-
TERMINAL TAIL SIGNATURE


PR00239E 1.58 2.739e-09 225-
237 






Actins proteins. 


BL00406C 6.75 1.000e-40 157-
212 BL00406B 5.47 6.143e-37
90-145 BL00406D 12.58 4.600e-
36 291-346 BL00406E 8.44
2.200e-33 364-414 BL00406A
9.95 4.441e-23 7-42 


540 




RIBOSOMAL PROTEIN P2
SIGNATURE 


PR00456E 3.06 9.625e-10 44-59


541 




RIBOSOMAL PROTEIN P2

SIGNATURE 


PR00456E 3.06 9.625e-10 44-59 


542 




Ank repeat proteins. 


PF00023A 16.03 7.857e-ll 138- 
154 


544 


PF00642 


Zinc finger C-x8-C-x5-C-x3-H type (and 
similar).


PF00642 11.59 9.082e-10 838-849


546 


BL00383 


Tyrosine specific protein phosphatases 
proteins. 


BL00383E 10.35 4.115e-10 104- 
115 


547 


BL01226 


Hydroxymethylglutaryl-coenzyme A
synthase proteins. 


BL01226A 13.79 1.000e-40 50-89
BL01226C 13.51 1.000e-40 127-
167 BL01226D 11.60 1.000e-40
174-210 BL01226E 13.74 1.000e-
40 212-253 BL01226H 17.74
1.000e-40 386-434 BL01226I
25.06 1.000e-40 460-508
BL01226G 15.76 3.483e-32 292- 
321 BL01226B 13.35 1.818e-31 
95-127 BL01226F 9.78 8.714e-23 
253-271 


549 


BL00964


Syndecans proteins.


BL00964B 12.05 2.426e-10 1246- 
1289 


551 


DM01930 


2 kw FINGER SMOC SMCY 


DM01930E 15.41 1.367e-37 170- 
215 DM01930F 14.16 8.232e-28 
267-303 DM01930B 19.86 
9.163e-10 37-71 


552 


BL00195 


Glutaredoxin proteins.


BL00195B 15.31 7.158e-09 9-29


554 




Tyrosine specific protein phosphatases 
proteins. 


BL00383E 10.35 2.756e-12 436- 
447 


555 


PR00403 


WW DOMAIN SIGNATURE 


137 PR00403A 16.82 3.912e-10
107-121 PR00403B 12.19 2.068e- 
09 76-91 


558 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 2.714e-26 76-98 
PR00380D 9.93 3.000e-24 275-








297 PR00380C 13.18 5.154e-20 
226-245 PR00380B 12.64 9.400e- 
20 195-213 


559


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 5.333e-09 522-531 


561 


PD01795 


PROTEIN AMINOPEPTIDASE
PRECURSOR HYDROLASE SIGNA. 


PD01795B 11.56 2.333e-12 159-
172 PD01795A 10.27 1.000e-09
135-144 


562 


PD01795 


PROTEIN AMINOPEPTIDASE 
PRECURSOR HYDROLASE SIGNA. 


PD01795B 11.56 2.333e-12 110- 
123 PD01795A 10.27 1.000e-09
86-95 


563 


BL00018 


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 1.391e-09 41-54


565 


BL00348 


p53 tumor antigen proteins. 


BL00348F 23.19 4.143e-09 188- 

231 


567 


PD00301 


PROTEIN REPEAT MUSCLE 
CALCIUM-BI. 


PD00301B 5.49 4.115e-09 284-
295 


569 


PF00850 


Histone deacetylase family. 



PF00850E 8.88 6.553e-21 756-782 
PF00850D 14.76 1.519e-16 722- 
746 PF00850F 15.70 1.118e-11
794-827 PF00850G 22.75 8.375e- 
11 833-875 


570 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 


PD00289 9.97 4.960e-10 137-151 


571 


BL00518 


Zinc finger, C3HC4 type (RING finger),
proteins. 


BL00518 12.23 8.800e-11 44-53


573


BL00299 


Ubiquitin domain proteins. 


BL00299 28.84 1.123e-11 123-175


574


PF01140 


Matrix protein (MA), pl5. 


PFOl 1400 15.54 3.700e-10 986- 
1021 




BL00284


Serpins proteins.


BL00284C 28.56 5.200e-26 200-
242 BL00284A 15.64 4.91^e-18
71-95 BL00284B 17.99 7.261e-15
173-194 BL00284D 16.34 5.846e-
13 306-333 BL00284E 19.15
7.429e-12 387-412


579 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.553e-29 15-54


580 


BL50001 


Src homology 2 (SH2) domain proteins 
profile. 


BL50001B 17.40 4.500e-12 1010-
1031 


581 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 3.189e-22 608-
649 PD00930A 25.62 6.806e-17
505-531 


584 


BL00612 


Osteonectin domain proteins. 


BL00612B 11.35 2.034e-11 93-
126 


585 


DM01551 


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551C 14.62 8.859e-10 102-
122 


586 


PF00628 


PHD-finger. 


PF00628 15.84 3.455e-12 235-250


587 


BL00027 


'Homeobox' domain proteins.


BL00027 26.43 6.063e-10 85-128 


588 


PR00326 


GTP1/OBG GTP-BINDING PROTEIN
FAMILY SIGNATURE 


PR00326A 8.75 7.525e-16 227- 
248 PR00326C 9.79 6.760e-15
276-292 PR00326D 19.09 6.657e- 
13 293-312 PR00326B 16.74 
9.229e-13 248-267


589 


BL00422 


Granins proteins. 


BL00422A 28.34 7.429e-09 2349- 
2378 


590 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.794e-10 295-
339 


591 


BL00128 


Alpha-lactalbumin / lysozyme C proteins. 


BL00128A 20.76 3.423e-13 35-65 
BL00128C 19.34 2.980e-11 110-


596 
597 


PR00049 


WILM'S TUMOUR PROTEIN 

SIGNATURE 


132

PR00049D 0.00 3.136e-09 31-46 




DM00547 


1 kw CHROMO BROMODOMAIN 
SHADOW GLOBAL. 


DM00547C 1 JJ{) 1.667e-19 207-
229 DM00547E 13.94 6.200e-18
319-342 DM00547B 11.28

1.000e-17 179-193 DM00547D

11.60 9.250e-13 289-303
DM00547F 23.43 6.727e-12 679- 
726 DM00547A 12.38 4.818e-ll 
158-170 


600 


PD01066 


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.


PD01066 19.43 1.882e-27 13-52 


601 


BL00192 




BL00192A 11.90 6.400e-09 390-
430 


602 


BL00936 


Ribosomal protein L35 proteins.


BL00936B 27.27 8.615e-09 118-
157 


603 


BL00936 


Ribosomal protein L35 proteins. 


BL00936B 27.27 8.615e-09 118-
157 


606 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 7.300e-10 292-
306 PR00019A 11.19 5.667e-09
323-337 


607 


PR00019 


LEUCINE-RICH REPEAT 

SIGNATURE 


PR00019B 11.36 7.300e-10 292- 
306 PR00019A 11.19 5.667e-09

323-337 


608 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320C 13.01 9.500e-12 168-
183 PR00320A 16.74 2.853e-10 
60-75 PR00320A 16.74 4.706e-10 
14-29 PR00320C 13.01 5.320e-10 
60-75 PR00320C 13.01 5.680e-10 
14-29 PR00320A 16.74 6.049e-09 
217-232 PR00320B 12.19 8.875e- 
09 168-183 


610 
FTlS 


BL00750 


Chaperonins TCP-1 proteins. 


BL00750B 16.17 1.000e-40 70-
120 BL00750A 20.07 6.211e-37
26-69 BL00750G 20.12 8.800e-31
431-471 BL00750F 18.40 5.125e-
30 370-411 BL00750E 24.59
8.650e-29 295-332 BL00750H
21.44 1.000e-27 489-524
BL00750C 25.65 5.345e-17 149-
181 BL00750D 16.16 6.318e-14 
203-222 


615 


BL00766 
BL00256 


Tetrahydrofolate

dehydrogenase/cyclohydrolase proteins.
Adipokinetic hormone family proteins.


BL00766B 24.49 1.000e-40 147-
190 BL00766E 13.78 1.000e-40
322-359 BL00766C 25.86 5.500e- 
39 208-256 BL00766D 17.05 
4.536e-26 283-313 BL00766A 
21.48 6.063e-24 102-132 


616


BL00319 


Amyloidogenic glycoprotein extracellular
domain proteins.


BL00256 12.28 3.298e-10 746-755
BL00319C 17.12 9.053e-09 419-

453


617 


BL00030


Eukaryotic RNA-binding region RNP-1
proteins. 


BL00030A 14.39 4.429e-09 44-63


618 
620


BL00030


Eukaryotic RNA-binding region RNP-1
proteins.


BL00030A 14.39 4.429e-09 44-63


622


BL00325
BL00972


Actin-depolymerizing proteins.




BL00325B 21.66 5.817e-16 77-

23 






Ubiquitin carboxyl-terminal hydrolases


BL00972A 11.93 5.500e-19 213-






WO 01/57190



PCT/US01/04098



SEQ 
ID 
NO: 



ACCESSION 
NO. 



DESCRIPTION 



RESULTS* 



family 2 proteins. 



231 BL00972D 22.55 2.742e-16 
501-526 BL00972B 9.45 1.000e-
11 297-307 BL00972C 16.48
3.160e-11 370-385 BL00972E
20.72 7.517e-10 526-548



625 



PD01066 



PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 



PD01066 19.43 6.333e-39 6-45 



628 



BL00039 



DEAD-box subfamily ATP-dependent 
helicases proteins. 



BL00039D 21.67 7.750e-31 478- 
524 BL00039A 18.44 2.000e-25 
198-237 BL00039C 15.63 1.844e-
15 327-351 BL00039B 19.19
5.636e-14 242-268



630 



PD00306 



PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 



PD00306A 10.26 7.000e-12 232-
246 



631 



PD00306 



PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 



PD00306A 10.26 7.000e-12 290- 
304 



633 



BL00785 



5'-nucleotidase proteins.



BL00785C 9.45 3.625e-16 108- 
122 BL00785E 15.85 4.000e-16 
279-295 BL00785A 9.73 6.500e- 
14 29-40 BL00785B 10.65 
5.500e-13 72-86 BL00785D 9.89
4.000e-12 135-145 



636 



PR00832 



PAXILLIN SIGNATURE 



PR00832E 14.43 9.901e-14 85- 
108 



637 



PR00109 



TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 



PR00109B 12.27 6.362e-13 221- 

240 



638 



PF00635 



MSP (Major sperm protein) domain 
proteins. 



PF00635B 15.84 4.900e-ll 463- 

502 



639 



PR00860 



VERTEBRATE METALLOTHIONEIN

SIGNATURE



PR00860B 7.04 1.900e-18 85-99 
PR00860C 9.61 1.474e-14 99-109
PR00860A 5.46 1.720e-14 63-76



641 



PD00066 



PROTEIN ZINC-FINGER METAL- 
BINDI.



PD00066 13.92 4.462e-15 271-284
PD00066 13.92 4.462e-15 299-312
PD00066 13.92 2.800e-14 327-340
PD00066 13.92 2.800e-14 383-396
PD00066 13.92 2.800e-14 411-424
PD00066 13.92 7.000e-14 355-368
PD00066 13.92 8.800e-14 439-452
PD00066 13.92 8.800e-14 495-508
PD00066 13.92 1.500e-13 551-564
PD00066 13.92 7.000e-13 467-480
PD00066 13.92 7.000e-13 523-536
PD00066 13.92 9.500e-13 215-228
PD00066 13.92 9.500e-13 243-256
PD00066 13.92 9.500e-13 579-592
PD00066 13.92 8.615e-10 607-620
PD00066 13.92 1.600e-09 187-200



642 



BL00961 



Ribosomal protein S28e proteins. 



BL00961B 11.24 7.429e-37 67-
100 BL00961A 9.90 4.079e-26

42-66 



643 



BL00585 



Ribosomal protein S5 proteins. 



BL00585A 28.43 1.391e-40 103- 
155 BL00585B 18.78 3.250e-30
193-230 



647 



BL00678 



Trp-Asp (WD) repeat proteins proteins. 



BL00678 9.67 9.400e-10 181-192 



648 



PR00876 



NEMATODE METALLOTHIONEIN
SIGNATURE 



PR00876C 6.15 9.229e-09 112- 
126 



652 



PD01066 



PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 



PD01066 19.43 5.941e-27 29-68



653 



BL00047 



Histone H4 proteins. 



BL00047A 13.53 1.000e-40 2-41



654 






PD01066 






PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU. 






BL00047B 6.51 1.429e-40 41-74
BL00047C 12.18 1.310e-38 74- 
104 



PD01066 19.43 4.109e-25 30-69 



657 



BL00518 



GTP-binding nuclear protein ran proteins. 



Zinc finger, C3HC4 type (RING finger), 

proteins. 

Serine/threonine specific protein



BL01115A 10.22 3.483e-17 19-



BL00518 12.23 8.286e-10 31-40



phosphatases proteins. 



659 



PD00066 



PROTEIN ZINC-FINGER METAL- 
BINDI. 



BL00125B 21.48 1.000e-40 89-
135 BL00125C 19.97 1.000e-40
153-200 BL00125D 33.11 1.000e-
40 213-268 BL00125A 14.83
8.941e-38 47-84 



660 

661



PD01066 



BL00795 



662 



BL00469 



663 



BL01160 



PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 



Involucrin proteins. 



PD00066 13.92 8.200e-16 492-505
PD00066 13.92 9.308e-15 380-393
PD00066 13.92 6.000e-13 352-365
PD00066 13.92 7.000e-13 240-253
PD00066 13.92 7.500e-13 268-281
PD00066 13.92 7.500e-13 408-421
PD00066 13.92 2.174e-11 464-477
PD00066 13.92 1.000e-10 436-449
PD01066 19.43 2.189e-26 29-68 



Nucleoside diphosphate kinases proteins.



Kinesin light chain repeat proteins. 



BL00795C 17.06 7.882e-15 193-
238 BL00795C 17.06 3.797e-13
187-232 BL00795C 17.06 5.014e- 
13 188-233 BL00795C 17.06 
4.506e-12 196-241 BL00795C 
17.06 7.896e-12 191-236 
BL00795C 17.06 1.667e-ll 185- 

230 BL00795C 17.06 2.000e-ll 
198-243 BL00795C 17.06 3.778e- 
11 171-216 BL00795C 17.06 
6.111e-ll 197-242 BL00795C 
17.06 6.444e-ll 194-239 
BL00795C 17.06 8.000e-ll 189- 
234 BL00795C 17.06 8.556e-11
192-237 BL00795C 17.06 1.733e- 
10 195-240 BL00795C 17.06 
2.779e-10 184-229 BL00795C 
17.06 4.035e-10 199-244 
BL00795C 17.06 5.081e-10 186- 

231 BL00795C 17.06 6.965e-10
190-235 BL00795C 17.06 2.700e- 
09 200-245 BL00795C 17.06 
5.800e-09 175-220 BL00795C 
17.06 6.500e-09 182-227 
BL00795C 17.06 6.600e-09 201- 
246 BL00795C 17.06 6.600e-09 
202-247 BL00795C 17.06 6.600e- 
09 208-253 



BL00469 22.22 1.000e-40 149-204



BL01160B 19.54 9.411e-11 331-
385 



BL00601 



665
666



BL00082 



DM01537 



Tryptophan pentad repeat proteins (IRF 
family) proteins. 



Extradiol ring-cleavage dioxygenases 
proteins. 



BL00601A 20.29 5.500e-23 7-46 
BL00601B 20.92 3.631e-13 69-98 



kw SKI2W SKI2 NUCLEOLAR



BL00082A 19.07 8.615e-12 49-72 



DM01537B 21.63 4.073e-37 834-






HELICASE. 


881 DM01537B 21.63 9.750e-21 
1669-1716 DM01537A 15.14 
8.650e-18 698-718 DM01537A
15.14 6.766e-12 1537-1557 


667 


DM01537 


kw SKI2W SKH NUCLEOLAR 
HELICASE. 


DM01537B 21.63 7.923e-38 820- 
867 DM01537B 21.63 9.750e-21 
1655-1702 DM01537A 15.14 
8.650e-18 684-704 DM01537A
15.14 6.766e-12 1523-1543 


669 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 6.786e-24 849- 
880 BL00107B 13.31 6.727e-13
916-932 


670 


BL00299 


Ubiquitin domain proteins. 


BL00299 28.84 9.735e-27 37-89


671 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 6.571e-12 432-475


676 


PR00861 


ALPHA-LYTIC ENDOPEPTIDASE 
SERINE PROTEASE (S2A) 
SIGNATURE 


PR00861E 9.88 2.385e-09 206- 
221 


678 


BL00225 


Crystallins beta and gamma 'Greek key'
motif proteins. 


BL00225B 18.06 7.517e-24 1805-
1840 BL00225B 18.06 8.297e-20
1987-2022 BL00225B 18.06
2.575e-19 1896-1931 BL00225B
18.06 8.200e-19 175-210
BL00225B 18.06 8.200e-19 1698-
1733 BL00225B 18.06 4.808e-14
73-108 BL00225B 18.06 4.808e-
14 1596-1631 BL00225B 18.06
5.500e-14 2077-2112 BL00225A
13.82 5.829e-12 2043-2064
BL00225A 13.82 3.127e-09 1759-
1780 


679


PR00320


G-PROTEIN BETA WD-40 REPEAT
SIGNATURE


PR00320C 13.01 4.240e-10 169-
184 PR00320A 16.74 6.294e-10
169-184 


680 


BL00243 


Integrins beta chain cysteine-rich domain 
proteins. 


BL00243I 31.77 1.143e-11 172-
215 


681 


PR00852 


XERODERMA PIGMENTOSUM 
GROUP D PROTEIN SIGNATURE 


PR00852H 5.90 1.000e-29 612-
635 PR00852E 8.14 3.769e-27
348-371 PR00852D 11.38 8.875e-
27 309-331 PR00852B 11.08
2.800e-25 249-269 PR00852I
17.26 3.500e-25 683-704
PR00852F 11.85 5.909e-24 379-
398 PR00852G 16.19 4.462e-23
468-486 PR00852C 8.81 9.143e-
23 284-303 


682 


BL50058 


G-protein gamma subunit profile. 


BL50058 27.23 1.375e-35 15-63 


685 


BL00972 


Ubiquitin carboxyl-terminal hydrolases
family 2 proteins.


BL00972A 11.93 7.500e-20 40-58
BL00972D 22.55 3.903e-16 300-
325 BL00972B 9.45 1.000e-13
120-130 BL00972E 20.72 5.500e- 
11 325-347 


687 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 4.273e-14 98- 
138 


688 


BL00388 


Proteasome A-type subunits proteins. 


BL00388A 23.14 1.000e-40 8-54
BL00388B 31.38 3.864e-33 66-
108 BL00388D 20.71 1.000e-21
153-184 BL00388C 18.79 8.147e- 
16 126-148 


689 


PD02796 


PROTEIN STEROL CARRIER LIPID- 


PD02796B 20.92 1.105e-15 347-
691
692


PD01572 
BL00028 


PHOTOSYSTEM II REACTION
CENTRE T PROTEIN PHOTOS. 


394 

PD01572 8.77 4.083e-09 1-31 


694 


BL01013 


Zinc finger, C2H2 type, domain proteins.
Oxysterol-binding protein family 
proteins. 


BL00028 16.07 7.600e-10 488-505 
BL01013A 25.14 9.357e-33 527-
563 BL01013D 26.81 8.235e-23
814-858 BL01013C 9.97 6.211e-
14 615-625 BL01013B 11.33 
3.605e-13 592-603 


695 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 


PD00289 9.97 3.571e-13 164-178
PD00289 9.97 8.650e-11 2147-2161
PD00289 9.97 2.552e-09 23-37


698 


PR00161 


NICKEL-DEPENDENT 
HYDROGENASE/B-TYPE 
CYTOCHROME SIGNATURE 


PR00161C 9.51 4.930e-09 282-302


700 


PR00749


LYSOZYME G SIGNATURE


PR00749F 13.63 8.636e-13 139-156
PR00749H 8.22 3.681e-12 173-194
PR00749B 16.54 1.419e-11 48-70
PR00749C 7.26 3.060e-11 72-91
PR00749A 10.33 4.815e-10 24-45


703 


PR00704 


CALPAIN CYSTEINE PROTEASE (C2) 
FAMILY SIGNATURE 


PR00704I 9.52 1.000e-29 476-505
PR00704D 11.05 2.500e-27 132-158
PR00704E 12.55 5.500e-27 162-186
PR00704F 13.61 1.000e-22 187-215
PR00704G 13.87 1.237e-21 317-339
PR00704H 13.38 8.138e-21 367-385
PR00704A 14.68 2.125e-19 27-51
PR00704C 11.88 1.257e-17 96-113
PR00704B 17.94 1.833e-15 72-95


705 


PR00859 


PROKARYOTE METALLOTHIONEIN
SIGNATURE


PR00859C 7.06 2.776e-09 94-111


706 


BL00226 


Intermediate filaments proteins. 


BL00226D 19.10 9.581e-26 369-416
BL00226B 23.86 3.250e-24 203-251
BL00226C 13.23 8.269e-21 268-299
BL00226A 12.77 8.200e-14 103-118


707 


PR00021 


SMALL PROLINE-RICH PROTEIN
SIGNATURE


PR00021A 4.31 2.440e-10 2-15 


708 


BL00361 


Ribosomal protein S10 proteins.


BL00361B 18.34 5.101e-10 82- 
105 


709 


PR00021 


SMALL PROLINE-RICH PROTEIN 
SIGNATURE 


PR00021A 4.31 2.200e-10 2-15


710 


BL00514 


Fibrinogen beta and gamma chains C-
terminal domain proteins.


BL00514C 17.41 8.412e-27 160-197
BL00514E 14.28 8.909e-16 219-236
BL00514H 14.95 1.551e-15 317-342
BL00514G 15.98 7.750e-15 284-314
BL00514D 15.35 4.789e-10 201-214


711 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 8.714e-12 49-90 


714


BL00400


LBP / BPI / CETP family proteins.


BL00400C 24.53 6.029e-17 158-202
BL00400D 23.26 2.080e-14 222-259
BL00400A 21.59 1.600e-10 27-59


715


BL01154


RNA polymerases L / 13 to 16 Kd
subunits proteins.


BL01154B 24.55 5.500e-36 40-76
BL01154A 18.70 3.000e-22 19-40


716 


PD01066 


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.


PD01066 19.43 9.786e-32 10-49 


717 


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 9.206e-14 77- 
102 BL00215A 15.82 8.412e-10 
175-200 


719 


BL00309 


Vertebrate galactoside-binding lectin 
proteins. 


BL00309C 18.65 2.241e-09 62-87 


726 


BL00687


Aldehyde dehydrogenases glutamic acid 
proteins. 


BL00687E 25.37 7.136e-33 266-316
BL00687D 26.00 5.333e-28 15M98
BL00687B 17.54 3.647e-26 39-81
BL00687C 24.13 6.087e-22 96-133
BL00687F 9.55 2.500e-11 352-363


727 


DM01354 


kw TRANSCRIPTASE REVERSE II 
ORF2. 


DM01354N 13.17 1.000e-40 129-174
DM01354O 8.73 6.605e-15 180-226


734 


PD00301 


PROTEIN REPEAT MUSCLE 
CALCIUM-BI. 


PD00301A 10.24 6.400e-09 101- 
112 


735


BL01024 


Protein phosphatase 2A regulatory 
subunit PR55 proteins. 


BL01024A 10.26 1.000e-40 22-69
BL01024B 8.91 1.000e-40 86-127
BL01024C 7.80 1.000e-40 146-185
BL01024D 13.22 1.000e-40 185-222
BL01024E 11.96 1.000e-40 222-266
BL01024F 9.42 1.000e-40 266-317
BL01024G 11.09 1.000e-40 317-349
BL01024H 13.88 1.000e-40 389-442


736 


PF00913 


Trypanosome variant surface 
glycoprotein.


PF00913D 11.90 7.130e-10 24-51


737 


PR00700 


PROTEIN TYROSINE PHOSPHATASE 
SIGNATURE 


PR00700D 12.47 2.200e-09 82-101


740 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320C 13.01 1.600e-09 68-83 
PR00320A 16.74 7.366e-09 68-83 


743 


PR00871 


DNA 

NUCLEOTIDYLEXOTRANSFERASE 
(TDT) SIGNATURE 


PR00871G 14.48 8.000e-09 178- 
201 


745 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 2.286e-10 33-42


749 


BL00215 


Mitochondrial energy transfer proteins.


BL00215A 15.82 5.200e-15 221-246
BL00215A 15.82 7.618e-14 20-45
BL00215A 15.82 8.851e-11 123-148
BL00215B 10.44 9.526e-11 69-82
BL00215B 10.44 7.300e-09 272-285
BL00215B 10.44 8.500e-09 165-178


751 


BL50002 


Src homology 3 (SH3) domain proteins 
profile. 


BL50002A 14.19 1.000e-14 370-389
BL50002B 15.18 2.200e-10 408-422


752 


BL00353 


HMG1/2 proteins.


BL00353B 11.47 3.089e-12 390- 
440 


753 


PF00622 


Domain in SPla and the RYanodine
Receptor.


PF00622B 21.00 4.214e-14 47-69


754 


BL00211 


ABC transporters family proteins. 


BL00211A 12.23 8.941e-10 66-78 


755 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 7.750e-19 392-415
PR00926G 16.07 5.935e-17 253-274
PR00926D 10.53 2.059e-15 301-320
PR00926E 11.70 4.971e-15 344-363
PR00926B 16.07 9.526e-13 210-225
PR00926A 10.41 1.514e-12 197-211


756 


BL01187 


Calcium-binding EGF-like domain
proteins pattern proteins.


BL01187A 9.98 2.125e-12 324-336
BL01187A 9.98 4.789e-11 377-389
BL01187B 12.04 3.057e-10 439-455


757 


PF00651 


BTB (also known as BR-C/Ttk) domain
proteins.


PF00651 15.00 4.429e-10 43-56 


758 


PR00055 


HIV TAT DOMAIN SIGNATURE 


PR00055A 8.13 8.855e-09 144- 
156 


759 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 5.304e-11 110-123


760 

765 


PR00448 
BL01042 


NSF ATTACHMENT PROTEIN
SIGNATURE

Homoserine dehydrogenase proteins. 


PR00448D 12.42 3.455e-27 162-186
PR00448A 10.74 1.273e-22 37-57
PR00448B 16.01 9.379e-21 100-118
PR00448C 11.46 1.000e-20 129-147


766 


PR00625 


DNA J PROTEIN FAMILY 
SIGNATURE 


BL01042A 13.29 5.909e-11 74-95
PR00625A 12.84 2.154e-18 26-46
PR00625B 13.48 9.000e-16 57-78


768 
769 


BL00762 
PR00709 


WHEP-TRS domain proteins.
AVIDIN SIGNATURE 


BL007f?7A 9^ H ^nno '^c t n 

i~>M^\f\j / ij^r\. z,o.HD o.jUUc-,Zc) il,id- 

149 BL00762B 16.14 3.793e-12 
64-78 BL00762A 23.43 6.625e-12 
6-43 BL00762C 15.58 4,176e-09 
459-472 BL00762D 11.15 9.667e- 
09210-220 


770 




G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00709A 4.60 1.934e-09 1-20
PR00320C 13.01 1.720e-10 262-277
PR00320A 16.74 2.853e-10 262-277
PR00320C 13.01 4.300e-09 96-111
PR00320B 12.19 5.500e-09 262-277
PR00320A 16.74 6.268e-09 55-70


771 


PR00019 


LEUCINE-RICH REPEAT 

SIGNATURE


PR00019B 11.36 8.714e-12 87-101
PR00019A 11.19 1.000e-10 90-104


772 


PD02807 


APOLIPOPROTEIN E PRECURSOR 
APO-E GLYCOPROTEIN PLAS. 


PD02807C 8.91 6.308e-10 110- 
159 


773 


PD02807 


APOLIPOPROTEIN E PRECURSOR 
APO-E GLYCOPROTEIN PLAS. 


PD02807C 8.91 6.308e-10 155- 
204 


774 


DM00547 


kw CHROMO BROMODOMAIN
SHADOW GLOBAL.


DM00547F 23.43 3.942e-28 943-990
DM00547E 13.94 9.750e-21 652-675
DM00547B 11.28 1.818e-18 518-532
DM00547C 17.30 3.531e-17 546-568
DM00547A 12.38 1.273e-11 497-509
DM00547D 11.60 9.200e-11 622-636


776 


PR00779


INOSITOL 1,4,5-TRISPHOSPHATE-
BINDING PROTEIN RECEPTOR 
SIGNATURE 


PR00779F 14.51 5.147e-09 769- 
792 


777 


PR00779


INOSITOL 1,4,5-TRISPHOSPHATE-
BINDING PROTEIN RECEPTOR 
SIGNATURE 


765 


778 


PR00779


INOSITOL 1,4,5-TRISPHOSPHATE-
BINDING PROTEIN RECEPTOR 
SIGNATURE 


PR00779F 14.51 5.147e-09 742-765



WO 01/57190

PCT/US01/04098


SEQ ID NO:


ACCESSION NO.


DESCRIPTION


RESULTS*


779 


BL01282 


BIR repeat proteins. 


BL01282B 30.49 2.543e-09 6-45 


781 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 3.118e-11 654-672
PR00205B 11.39 8.588e-11 230-248
PR00205B 11.39 8.527e-10 551-569
PR00205B 11.39 4.203e-09 336-354


783 


BL00625 


Regulator of chromosome condensation 
(RCC1) proteins.


BL00625B 17.69 2.167e-19 193-227
BL00625A 16.21 5.500e-17 199-228
BL00625B 17.69 1.885e-16 140-174
BL00625B 17.69 2.770e-16 245-279
BL00625A 16.21 9.115e-16 251-280
BL00625A 16.21 6.507e-14 146-175


785 


PF00084 


Sushi domain proteins (SCR repeat
proteins.


PF00084B 9.45 7.188e-10 595-607
PF00084B 9.45 6.400e-09 656-668


786 


PF00084 


Sushi domain proteins (SCR repeat
proteins.


PF00084B 9.45 7.188e-10 595-607
PF00084B 9.45 6.400e-09 656-668


787 


BL00826 


MARCKS family proteins. 


BL00826C 7.63 6.73 8e-^9 203- 
230 


788 


PR00453 


VON WILLEBRAND FACTOR TYPE 
A DOMAIN SIGNATURE 


PR00453A 12.79 1.310e-14 36-54 

PR00453B 14.65 8.568e-10 75-90 


789 


PR00102 


ORNITHINE 

CARBAMOYLTRANSFERASE 
SIGNATURE 


PR00102B 14.82 5.418e-09 963- 
977 


790 


BL00030 


Eukaryotic RNA-binding region RNP-1 
proteins. 


BL00030B 7.03 5.500e-11 199-209


791 


BL00415 



Synapsins proteins. 


BL00415N 4.29 9.519e-10 393-437
BL00415N 4.29 2.117e-09 103-147
BL00415N 4.29 3.628e-09 97-141
BL00415N 4.29 5.664e-09 387-431


795 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.091e-36 105-144 


799 


PF00731 


AIR carboxylase. 


PF00731C 23.16 7.333e-35 337- 
380 PF00731B 19.47 7.429e-28 
299-336 PF00731A j9.32 6.333e- 
24 268-297 


804 


BL00170 


Cyclophilin-type peptidyl-prolyl cis-trans 
isomerase signatur. 


BL00170B 20.97 8.071e-09 297- 

337 


805 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 3.400e-10 378-389
BL00678 9.67 5.800e-10 418-429
BL00678 9.67 8.800e-10 295-306


806 


PD01719 


PRECURSOR GLYCOPROTEIN 
SIGNAL RE. 


PD01719A 12.89 7.571e-14 290- 
318 


807 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320B 12.19 9.100e-09 451-466


809 


BL00107 


Protein kinases ATP-binding region
proteins. 


BL00107A 18.39 4.462e-12 564-595


810 


PR00453 


VON WILLEBRAND FACTOR TYPE 
A DOMAIN SIGNATURE 


PR00453A 12.79 1.310e-14 36-54
PR00453B 14.65 8.568e-10 75-90 


814 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.047e-31 16-55 


815 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.047e-31 16-55 


817 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 5.154e-36 125-154
PR00193E 19.47 3.919e-18 179-208


818 


PR00830 


ENDOPEPTIDASE LA (LON) SERINE


PR00830A 8.41 9.571e-11 115-135


819


BL00126


3'5'-cyclic nucleotide phosphodiesterases
proteins.


BL00126C 22.07 7.857e-24 528-569
BL00126E 35.22 3.714e-15 669-724
BL00126D 25.50 1.173e-14 584-623
BL00126B 15.20 1.000e-12 502-514
BL00126A 27.56 3.361e-09 461-498


820 


PR00511 


TEKTIN SIGNATURE 


PR00511B 12.25 8.826e-22 174-195
PR00511A 13.59 7.723e-11 155-172


821 


BL00741 


Guanine-nucleotide dissociation
stimulators CDC24 family sign. 


BL00741B 14.27 2.800e-15 13-36 


822 


PF00780


Domain found in NIK1-like kinases,
mouse citron and yeast ROM.


PF00780I 14.69 4.825e-09 231-261


827 


BL00030 


Eukaryotic RNA-binding region RNP-1 
proteins. 


BL00030A 14.39 5.235e-ll 144- 
163 


828 


BL00326 


Tropomyosins proteins. 


BL00326D 8.76 9.357e-11 545-586


829 


PD02448 


TRANSCRIPTION PROTEIN DNA- 
BINDIN. 


PD02448A 9.37 1.000e-40 46-85
PD02448B 10.17 1.000e-40 85-133
PD02448C 13.62 1.000e-40 152-189
PD02448E 11.33 9.000e-30 235-261
PD02448F 14.22 9.654e-25 279-303
PD02448D 11.48 3.659e-18 197-211
PD02448G 10.73 7.857e-16 305-318


830 


BL00720 


Guanine-nucleotide dissociation 
stimulators CDC25 family sign. 


BL00720B 16.57 4.500e-23 483- 
507 


831

832


BL00107
BL00215


Protein kinases ATP-binding region
proteins.

Mitochondrial energy transfer proteins


BL00107A 18.39 6.625e-21 143-174
BL00107B 13.31 4.214e-10 213-229


833 


PR00497


NEUTROPHIL CYTOSOL FACTOR
P40 SIGNATURE


BL00215A 15.82 5.787e-11 32-57
PR00497A 6.92 4.375e-09 41-59


834 




Tau and MAP proteins tubulin-binding 
domain proteins. 


BL00229A 23.57 9.565e-10 99- 
138 


835 
836


BL00421 


Transmembrane 4 family proteins. 


BL00421E 20.97 2.216e-09 1053-1083




BL00795 


Involucrin proteins. 


BL00795B 12.41 7.931e-09 405-445


837 




MAM DOMAIN SIGNATURE 


PR00020A 18.17 I.OOOe-17 34-53 
PR00020B 15,52 5M6e-16 68-85 
PR00020D 12.70 2.543e-15 147- 
162 PR00020C 13.66 3.483e.l3 
95-107 PR00020E8.64 6.586e-13 
165-179 


838 


BL50017 


Death domain proteins profile. 


BL50017B 17.60 6.897e-13 1499- 
1515 


839


PF00850 


Histone deacetylase family.


PF00850C 14.55 9.542e-09 1352-1369


840 


PF00023 


Ank repeat proteins. 



PF00023A 16.03 4.500e-12 44-60
PF00023B 14.20 7.923e-11 73-83
^F00095R 14 9ft 0 ftftfto in no 

149 PF00023B 14.20 5.500e.09 
10-50 


842 


BL01194


Ribosomal protein L15e proteins.


BL01194B 13.66 1.000e-40 37-85
BL01194C 12.35 9.250e-40 103-138
BL01194A 18.70 7.632e-38 2-37
BL01194D 19.02 2.658e-36 139-178


843 


BL00610 


Sodium:neurotransmitter symporter
family proteins. 


BL00610A 17.73 1.000e-40 40-90
BL00610B 23.65 1.000e-40 104-154
BL00610C 12.94 1.000e-40 206-258
BL00610E 20.34 1.000e-40 355-398
BL00610F 29.02 1.000e-40 454-509
BL00610D 20.97 6.063e-35 272-325
BL00610G 12.89 8.588e-13 514-537


845 


BL00143


Insulinase family, zinc-binding region 
proteins. 


BL00143A 20.91 4.300e-20 94-121
BL00143C 14.16 5.500e-13 245-258
BL00143B 14.41 9.053e-10 141-156


846 


PR00543 


OESTROGEN RECEPTOR 
SIGNATURE 


PR00543D 10.87 1.355e-09 898- 
914 


847 


PR00543 


OESTROGEN RECEPTOR 
SIGNATURE 


PR00543D 10.87 1.355e-09 898- 
914 


848 


BL00824 


Elongation factor 1 beta/beta'/delta chain
proteins. 


BL00824C 14.58 1.000e-40 129-167
BL00824D 14.04 6.192e-39 167-202
BL00824B 9.21 2.080e-21 96-116
BL00824E 12.49 3.333e-19 210-226
BL00824A 13.78 8.650e-14 19-34


849 


PD01066


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 1.000e-40 12-51


850 


PD01066


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU. 


PD01066 19.43 7.316e-24 10-49 


852 


BL01272 


Glucokinase regulatory protein family
proteins.


BL01272B 19.61 6.870e-30 136-171
BL01272C 11.68 3.314e-25 249-274
BL01272A 6.49 1.231e-18 99-117


853 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 9.341e-20 65- 
106 


854 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 


PD00289 9.97 6.850e-11 140-154


858 


PR00450 


RECOVERIN FAMILY SIGNATURE 


PR00450C 12.22 3.250e-25 68-90
PR00450B 11.76 8.125e-23 22-42
PR00450D 16.58 8.920e-22 92-112
PR00450E 12.14 1.581e-19 114-133
PR00450G 15.33 5.500e-19 166-187
PR00450F 12.30 4.375e-15 140-156
PR00450A 13.58 1.857e-14 8-23


860 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 7.188e-27 74-117


866 


BL00477 


Alpha-2-macroglobulin family thiolester 
region proteins. 


BL00477L 23.51 7.480e-20 54-87 


867 


BL01078 


Molybdenum cofactor biosynthesis 
proteins. 


BL01078B 14.20 1.621e-20 408-429
BL01078A 10.16 2.000e-13 366-379
BL01078D 5.99 3.455e-11 566-576
BL01078C 10.52 3.793e-11 501-513


868 


BL01177 


Anaphylatoxin domain proteins. 


BL01177E 20.64 5.800e-24 462-489
BL01177C 17.39 5.333e-19 416-435
BL01177B 13.61 7.840e-16 122-138
BL01177D 17.50 1.900e-15 441-459


869 


BL01177 


Anaphylatoxin domain proteins. 


BL01177E 20.64 5.800e-24 415-442
BL01177C 17.39 5.333e-19 369-388
BL01177B 13.61 7.840e-16 122-138
BL01177D 17.50 1.900e-15 394-412


871 


BL50007 


Phosphatidylinositol-specific
phospholipase X-box domain proteins
prof.


BL50007A 19.61 1.000e-40 322-368
BL50007D 19.54 1.000e-40 589-631
BL50007B 20.90 6.700e-36 383-421
BL50007E 25.63 9.053e-33 748-785
BL50007C 8.97 5.200e-19 452-469


872 


BL00972 


Ubiquitin carboxyl-terminal hydrolases
family 2 proteins. 


BL00972D 22.55 3.250e-17 90- 
115 


874 


PR00452 


SH3 DOMAIN SIGNATURE


PR00452B 11.65 4.250e-09 370-386


877
878 


BL00741 
DM00215 


Guanine-nucleotide dissociation 
stimulators CDC24 family sign. 
PROLINE-RICH PROTEIN 3. 


BL00741B 14.27 5.500e-13 1343-1366


881 


PD02807 


APOLIPOPROTEIN E PRECURSOR 
APO-E GLYCOPROTEIN PLAS 


DM00215 19.43 2.525e-09 52-85 
PD02807E 10.90 4 J02e-09 358- 
407 


882 
885 


PD01066 

PF00023 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 
Ank repeat proteins. 


PD01066 19.43 7.188e-37 8-47 


886 


PR00372 


BIOPTERIN-DEPENDENT 
AROMATIC AMINO ACID 
HYDROXYLASE SIGNATURE


PF00023A 16.03 8.071e-09 10-26
PR00372B 10.30 9.308e-27 225-248
PR00372A 13.39 7.000e-24 134-154
PR00372E 12.62 2.125e-23 360-380
PR00372C 7.90 3.025e-22 289-309
PR00372F 13.09 6.333e-21 395-414
PR00372D 10.22 1.000e-19 329-348


887 


BL00301 


GTP-binding elongation factors proteins.


BL00301B 20.09 2.800e-24 103-135
BL00301A 12.41 4.316e-13 21-33


888 


BL00518 


Zinc finger, C3HC4 type (RING finger),
proteins. 


BL00518 12.23 1.667e-09 30-39 


889 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 4.906e-26 6-45 


890 


DM00179 


kw KINASE ALPHA ADHESION T-


DM00179 13.97 7.652e-09 113- 
123 


892 


BL01022 


PTR2 family proton/oligopeptide
symporters proteins.


BL01022B 22.19 6.016e-14 72-118
BL01022E 23.51 1.173e-12 472-508
BL01022A 11.58 9.135e-12 42-61
BL01022D 9.42 3.455e-11 199-212


893 


PD02407 


3-BISPHOSPHOGLYCERATE-
INDEPENDENT PHOSPHOGLYCER.


PD02407K 12.59 6.529e-10 360-383


894 


PD02407 


3-BISPHOSPHOGLYCERATE-
INDEPENDENT PHOSPHOGLYCER.


PD02407K 12.59 6.529e-10 360-383


895 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 

1 


PR00237B 13.50 9,100e-14 116- 
138 PR00237F 13.57 1.360e-13 
312-337 PR00237G 19.63 9.069e- 
13 353-380 PR00237E 13.03 
7.1206-12 243-267 PR00237D 
5.94 4.150e-ll lO^-^lfi 

^R00237AIL48 4.375e-1183- 
108 


896 


BL00129


Glycosyl hydrolases family 31 proteins.


3L00129D 16.76 8.258e-26 634- 
>78 BL00129A 26.21 1.720e-25 
84-430 BL00129E 22.60 4.857e- 
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23 698-734 BL00129C 15.12 
1.750e-22 596-624 BL00129B 
19.19 5.891e-18 495-522 
BL00129F 26.19 7J45e-I5 814- 

852 


897 


BL00598 


Chromo domain proteins. 


BL00598 14.45 1.220e-13 9-31 


898 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 6.000e-09 396-405 


899 


PD01101


INHIBITOR HEAVY CHAIN 
CHANNEL IN. 


PD01101B 21.53 1.000e-40 274-327
PD01101D 24.45 1.000e-40 457-512
PD01101A 18.25 6.268e-23 83-117
PD01101C 12.69 1.237e-16 366-386
PD01101E 6.73 7.750e-12 566-576


900 


PR00600 


PROTEIN PHOSPHATASE PP2A 55KD 
REGULATORY SUBUNIT 
SIGNATURE 


PR00600A 11.61 5.979e-09 31-52


901 


PD01066


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 8.116e-31 24-63


903 


BL01115


GTP-binding nuclear protein ran proteins. 


BL01115A 10.22 1.509e-11 21-65


906 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.174e-13 539-572
DM00215 19.43 4.750e-12 549-582
DM00215 19.43 9.824e-11 551-584
DM00215 19.43 2.929e-10 548-581
DM00215 19.43 4.054e-10 550-583
DM00215 19.43 5.339e-10 552-585
DM00215 19.43 7.107e-10 544-577


907 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 6.276e-12 314-332


908 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 5.950e-17 1125-1156


909 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 5.950e-17 1118- 
1149 


910 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 8.560e-13 150- 
181 


911 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 8.560e-13 150- 
181 


912 


PF00856 


SET domain proteins. 


PF00856A 26.14 4.553e-ll 243- 
280 


913 


PF00628 


PHD-finger. 


PF00628 15.84 6.400e-13 197-212 


914 


PR00962 


LETHAL(2) GIANT LARVAE
PROTEIN SIGNATURE 


PR00962D 10.40 1.000e-27 435-459
PR00962G 15.71 4.086e-26 593-618
PR00962B 11.98 9.122e-26 296-319
PR00962A 13.28 6.143e-22 15-34
PR00962C 8.00 4.000e-21 348-369
PR00962F 12.39 9.769e-21 552-572
PR00962H 13.32 2.636e-20 623-643
PR00962I 11.68 9.786e-20 692-712
PR00962E 8.81 2.915e-18 515-534


915 


PR00962 


LETHAL(2) GIANT LARVAE
PROTEIN SIGNATURE 


PR00962D 10.40 1.000e-27 365-389
PR00962G 15.71 4.086e-26 523-548
PR00962A 13.28 6.143e-22 15-34
PR00962C 8.00 4.000e-21 278-299
PR00962F 12.39 9.769e-21 482-502
PR00962H 13.32 2.636e-20 553-573
PR00962I 11.68 9.786e-20 622-642
PR00962E 8.81 2.915e-18 445-464


916 


BL00134 


Serine proteases, trypsin family, histidine 
proteins. 


BL00134A 11.96 5.886e-14 90-107


917 


BL00478 


LIM domain proteins. 


BL00478B 14.79 8.393e-13 211-226
BL00478B 14.79 6.712e-10 271-286


918 
922 


PR00049 
BL00150 


WILM'S TUMOUR PROTEIN
SIGNATURE 
Acylphosphatase proteins. 


PR00049D 0.00 5.729e-09 973- 
988 


924 


DM00031 


IMMUNOGLOBULIN V REGION. 


BL00150 25.33 1.000e-40 37-84
DM00031B 11.41 8.063e-09 79-113


925 


BL00072 


Acyl-CoA dehydrogenases proteins. 


BL00072D 30.08 2.837e-24 280-331
BL00072E 24.12 8.200e-24 368-411
BL00072C 25.30 7.873e-20 226-267
BL00072B 9.48 6.049e-12 183-196


927 


BL00237


G-protein coupled receptors proteins. 


BL00237C 13.19 1.692e-13 229- 
256 BL00237A 27.68 6.657e-13 
90-130 BL00237D 1U3 9.571e- 
13 290-307 


928 


BL01033 


Globins profile. 


BL01033A 16.94 7.923e-18 25-4T" 
BL01033B 13.81 l.OOOe-15 93- 
105 


929 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 8.714e-13 203-253


932 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.519e-10 353-397
BL00415N 4.29 2.117e-09 63-107
BL00415N 4.29 3.628e-09 57-101
BL00415N 4.29 5.664e-09 347-391


933 


PD02448 


TRANSCRIPTION PROTEIN DNA- 
BINDIN. 


PD02448A 9.37 1.000e-40 46-85
PD02448B 10.17 1.000e-40 85-133
PD02448C 13.62 1.000e-40 152-189
PD02448E 11.33 9.000e-30 223-249
PD02448F 14.22 9.654e-25 267-291
PD02448D 11.48 3.659e-18 197-211
PD02448G 10.73 7.857e-16 293-306


934 




w SPAC8A4.04C RESISTANCE 
SPAC8A4.05C DAUNORUBICIN. 


DM00191D 13.94 9.083e-10 136- 
175 


935 


BL01115 


GTP-binding nuclear protein ran proteins. 


BL01115A 10.22 4.696e-10 67-111


936 


BL00019 


Actinin-type actin-binding domain 
proteins. 


BL00019D 15.33 8.138e-14 865- 
895 


937 

938 
939 


PR00762

BL00027 
DM01111 I 


CHLORIDE CHANNEL SIGNATURE 

: 

'Homeobox' domain proteins.
kw PHOSPHATASE


PR00762A 14.22 4.000e-22 183- 
201 PR00762C 9.291.0006-21 
268-288 PR00762E 12.07 3.250e- 
20 520-537 PR00762D 11.29 
l.OOOe-1 9 470-491 PR00762F 

PR00762B 12.12 1.818e-18214- 
234 PR00762G 14.13 3.4556-17 

>77-592 

3L00027 26.43 9.500e-25 291-334 
:>MQ\ 1 1 IE 17.28 1.568e-10 248- 
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RESULTS* 






TRANSFORMING 61K PDF!. 


297 DMOllllE 17.28 5.168e-10 
659-708 DM01 11 ID 16.76 
5.263e-09 279-325 DM01 lllM 
10.67 8.674e-09 911-935 


940 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107B 13.31 1.000e-14 293-309
BL00107A 18.39 6.760e-13 229-260


942 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 9.832e-11 543-597


943 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 3.500e-35 8-47 


945 


BL00989 


Clathrin adaptor complexes small chain
proteins.


BL00989B 26.51 1.000e-40 66-117
BL00989A 11.66 1.000e-13 5-19


946 


PR00178 


FATTY ACID-BINDING PROTEIN 
SIGNATURE 


PR00178D 13.52 9.571e-09 450- 
469 


947 


BL00178 


Aminoacyl-transfer RNA synthetases 
class-I proteins. 


BL00178B 7.11 4.857e-09 713-724


948 


PF00628 


PHD-finger. 


PF00628 15.84 8.412e-14 201-216


951 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 2.050e-10 180- 
230 


952 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 4.300e-ll 26-49 
PR00926F 17.75 6.348e-09 134- 
157 


955 


PF00109 


Beta-ketoacyl synthase. 


PF00109 13.08 2.846e-12 342-357 


957 


PR00069 


ALDO-KETO REDUCTASE 
SIGNATURE 


PR00069A 16.01 8.826e-24 26-51 
PR00069B 11.33 1.514e-17 86- 
105 PR00069C 16.03 8.816e-14 
155-173 


958


PF00583


Acetyltransferase (GNAT) family. 


PF00583A 12.53 5.500e-10 631- - 

1542_-. / ; . 


961 


PR00328


GTP-BINDING SAR1 PROTEIN
SIGNATURE 


PR00328A 10.62 8.740e-10 7-31 


962 


BL00354 


HMG-I and HMG-Y DNA-binding
domain proteins (A+T-hook).


BL00354A 3.83 9.438e-10 1489-1499


963 


BL00354 


HMG-I and HMG-Y DNA-binding 

domain proteins (A+T-hook). 


BL00354A 3.83 9.438e-10 1489- 
1499 


964 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 7.188e-27 53-96 


965 


PF00992


Troponin. 


PF00992A 16.67 2.421e-09 581- 
616 


966 


PR00515 


5-HYDROXYTRYPTAMINE 1F
RECEPTOR SIGNATURE


PR00515D 7.91 5.741e-09 13-33


967 


BL00579 


Ribosomal protein L29 proteins. 


BL00579B 21.99 5.065e-21 164- 
194 


970 


BL00504 


Fumarate reductase / succinate 
dehydrogenase FAD-binding site 
proteins. 


BL00504C 18.68 2.227e-24 34-59 
BL00504D 10.43 7.261e-21 75-93 


973 


PF00580 


UvrD/REP helicase. 


PF00580A 13.37 4.720e-09 249- 
271 


974 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456F 5.86 1.000e-10 242-254


975 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 4.429e-22 99-139


976 


BL00031 


Nuclear hormones receptors DNA- 
binding region proteins. 


BL00031A 19.55 7.158e-33 60-93 
BL00031B 22^5 5.500e-28 94- 
126 


977 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 8.200e-16 196-209
PD00066 13.92 8.200e-16 336-349
PD00066 13.92 2.385e-15 476-489
PD00066 13.92 9.308e-15 252-265
PD00066 13.92 2.800e-14 448-461
PD00066 13.92 4.600e-14 392-405
PD00066 13.92 5.200e-14 280-293
PD00066 13.92 4.000e-13 224-237
PD00066 13.92 4.429e-12 308-321
PD00066 13.92 9.571e-12 420-433
PD00066 13.92 6.870e-11 168-181


978 


BL00721 


Formate-tetrahydrofolate ligase proteins. 


BL00721B 13.21 1.000e-40 346-401
BL00721D 13.90 1.000e-40 538-592
BL00721E 13.46 1.000e-40 597-646
BL00721I 18.79 2.500e-40 814-860
BL00721H 21.20 8.239e-39 763-814
BL00721A 15.31 9.719e-32 287-321
BL00721C 16.92 4.000e-30 498-535
BL00721F 15.96 8.232e-27 660-702
BL00721G 7.97 3.017e-10 721-734


981 


PD00126 


PROTEIN REPEAT DOMAIN TPR 
NUCLEA. 


PD00126A 22.53 2.552e-09 180- 
201 


982 
983 


BL00869 
PR00196 


Renal dipeptidase proteins. 
ANNEXIN FAMILY SIGNATURE 


BL00869C 12.58 3.1726-19 59-95 " 
BL00869E 13.12 9.129e-18 120- 
157 BL00869J 15.60 6.032e-17 
270-310 BL00869H 11.08 1.840e- 
16 219-242 BL00S69G 13.55 
2.543e- 1 6 1 14 RT nn h^qv 
12.77 7.0316-14157-192 
BL008691 12.92 3.274e-12 242" 
270 BL00869D 14 02 5 iSOt^ya 
95-124 BL00869B 15.55 9.382e- 
10 31-61 


984 


BL00485 


Adenosine and AMP deaminase proteins. 


PR00196F 13.89 2.125e-09 92-108
BL00485D 30.82 2.427e-10 154-209



* Results are listed in order: accession number subtype; raw score; p-value; position of signature in amino acid sequence.



TABLE 4 



SEQ ID
NO:


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


2 


ig 


Immunoglobulin domain 


3.9e-17 


60.3 


3 


HSP90 


Hsp90 protein 


0 


1548.4 


6 


tsp_1


Thrombospondin type 1 domain 


0.002 


22.1 


7 


7tm_1


7 transmembrane receptor (rhodopsin
family)


6.7e-08 


27.3 


9 


PWWP 


PWWP domain


8.1e-16


66.0 


12 


C1q


C1q domain


1.7e-26 


101.5 


13 


C1q


C1q domain


2e-20 


81.3 


14


Aa_trans


Transmembrane amino acid
transporter protein


2.7e-42


153.9


15


E1-E2_ATPase


E1-E2 ATPase


6.3e-124


412.2


16


trypsin


Trypsin


1.2e-87


278.6


17


ig


Immunoglobulin domain


7.6e-12


43.2


18


lectin_c


Lectin C-type domain


0.0003


21.2


20


Alpha_L_fucos


Alpha-L-fucosidase


1.2e-217


736.5


22 


pkinase 


Eukaryotic protein kinase domain 


3.3e-87 


303.1 


23 


pkinase 


Eukaryotic protein kinase domain 


2.7e-85 


296.8 


24 


pkinase 


Eukaryotic protein kinase domain 


2.7e-85


296.8 


25 


ank 


Ank repeat 


5.5e-14


59.9 


27 


pkinase 


Eukaryotic protein kinase domain 


1.5e-100


347.4 


28 


spectrin 


Spectrin repeat 


4e-57 


203.2 


29 


spectrin 


Spectrin repeat 


4e-57 


203.2 


30 


WD40 


WD domain, G-beta repeat 


1.2e-07 


38.8 


33 


rrm


RNA recognition motif.


1.1e-17


72.2 


34 


rrm


RNA recognition motif.


1.1e-17


72.2 


36 


7tm_1


7 transmembrane receptor (rhodopsin
family)


3e-36 


117.3 


37 


ank 


Ank repeat 


5.9e-25 


96.3 


38 


SRF-TF 


SRF-type transcription factor 


1.4e-36 


133.9 


40 


alk_phosphatase


Alkaline phosphatase 


0 


1034.9 


44 


zf-C2H2 


Zinc finger, C2H2 type 


8.6e-103 


354.9 


45 


sugar_tr


Sugar (and other) transporter 


3.1e-08 


40.3 


47 


7tm_2 


7 transmembrane receptor (Secretin 
family) 


6.4e-79 


275.6 


50 


zf-C2H2 


Zinc finger, C2H2 type 


1.3e-98 


341.0 


51 


filament 


Intermediate filament proteins 


1.2e-176


600.3 


52 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


2.7e-10 


37.7 


53 


Cadherin_C_term


Cadherin cytoplasmic region 


1.9e-94 


327.2 


54 


S_100 


S-100/ICaBP type calcium binding
domain


5.2e-18 


73.3 


58 


inositol_P


Inositol monophosphatase family


5e-13 


49.8 


59 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


8.8e-46 


147.6 


60 


Kunitz_BPTI


Kunitz/Bovine pancreatic trypsin
inhibito


.3.7e^7 • r- 


148.6 


62 


DAD 


DAD family


2.5e-74 


260.3 


63 


MOZ_SAS


MOZ/SAS family 


5.9e-133 


455.1 


64 


MOZ_SAS 


MOZ/SAS family 


1.7e-123


423.6 


65 


ras 


Ras family 


9.3e-89 


308.3


67 


Ham1p_like


Ham1 family


3.7e-49 


176.7 


68 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


5.2e-39 


126.1 


70 


zf-C2H2 


Zinc finger, C2H2 type 


1.5e-112 


387.3 


71 


Peptidase_M41 


Peptidase family M41 


1.2e-110 


381.0 


72 


abhydrolase 


alpha/beta hydrolase fold 


9.8e-05 


26.5 


81 


K_tetra


K+ channel tetramerisation domain 


0.022 


-16.8 


82 


pkinase 


Eukaryotic protein kinase domain 


5e-49 


176.3 


84 


AAA 


ATPases associated with various 
cellular act 


1.3e-77


271.3 


85 


homeobox 


Homeobox domain 


1.4e-28 


108.3 


87 


TGF-beta 


Transforming growth factor beta like 


6Je-68 


210.2 


91 


mito_carr


Mitochondrial carrier proteins 


4.6e-57 


198.5 


95 


adenylatekinase 


Adenylate kinase 


1.1e-15


60.0 


96 




Immunoglobulin domain


4.1e-20


69.8 


99 


CNH 


CNH domain 


3.4e-120 


412.7 


100 


homeobox 


Homeobox domain 


7.4e-32 


119.3 


101 


zf-C2H2 


Zinc finger, C2H2 type 


2.2e-47 


170.8 


102 


zf-C2H2 


Zinc finger, C2H2 type 


4.4e-89 


309.4 


103 


dynamin 


Dynamin family 


1.4e-150 


513.6 


104 


lectin_c


Lectin C-type domain 


4.2e-15 


63.6 


105 


lectin_c


Lectin C-type domain


4.2e-15 


63.6 


108 


metalthio 


Metallothionein 


2e-25 


97.9 


1 19 




Hsp20/alpha crystallin family


2.6e-20


77.7 


L ID 




Elongation factor TS


3.8e-63 


221.1 




sugar_tr 


Sugar (and other) transporter 


4e-63 


223.1 


lift 
HO 


catalase 


Catalase 


0 


1158.9 


1 10 


TTpTT 


Ubiquitin carboxyl-terminal 
hydrolase, famil 


1e-10


24.4 


122 


metalthio 


Metallothionein 


2.8e-25 


97.4 


17^ 


adh_short


short chain dehydrogenase


1.6e-45 


164.6 




KRAB


KRAB box


7.9e-25 


95.9 


177 
iz / 


G-alpha 


G-protein alpha subunit 


1e-249


843.0 


128 


mito_carr


Mitochondrial carrier proteins 


2e-65 


227.2 


1 "11 




EF-1 guanine nucleotide exchange 
domain 


4.9e-53 


189.6 






GYF domain 


4.9e-28 


106.6 


133 


GYF 


GYF domain 


4.9e-28 


106.6 


1 Qyl 


lipocalin 


Lipocalin / cytosolic fatty-acid 
binding pr 


2.1e-33 


119.1 




pkinase 


Eukaryotic protein kinase domain


3.3e-86 


299.8 




ank


Ank repeat 


2.2e-29 


111.1 




IL8


Small cytokines 
(intecrine/chemokine), inter 


3.1e-18 


65.2 


1 ^0 


pyridoxal_deC


Pyridoxal-dependent decarboxylase
conse 


0.00011 


19.0 


140 


cadherin 


Cadherin domain 


1.3e-88 


307.8 




efhand 


EF hand


5.7e-33 


123.0 


lA^ 


Acyltransferase 


Acyltransferase 


2e-29 


111.2 




cytochrome_c 


Cytochrome c 


1.7e-33 


124.7 


147 


pkinase 


Eukaryotic protein kinase domain 


2.3e-86 


300.3 




PDZ


PDZ domain (Also known as DHR or
GLGF). 


1.7e-09 


45.0 


149


aldo_ket_red


Aldo/keto reductase family 


7.4e-189


640.% 




homeobox


Homeobox domain 


3.2e-08 


38.7 


K1 

IDI 


PseudoU_synth 
1 


tRNA pseudouridine synthase 


4.7e-57 


203.0 


1 S7 


abhydrolase 


alpha/beta hydrolase fold 


1.7e-31 


118.0 


1 




PDZ domain (Also known as DHR or 
GLGF).


1.1e-09


45.6 


156 


PHD 


PHD-finger 


7.6e-15 


62.8 


1 S7 


fn3


Fibronectin type III domain


0.015 


21.9 


1 ^ft 


homeobox


Homeobox domain 


2.7e-27 


104.1 


1 Oli 




PWI domain 


3.9e-24 


93.6 


162 


DnaJ 


DnaJ domain 


2e-06 


34.8 


lO'f 




CBL proto-oncogene N-terminal
domain


8e-117 


401.5 


7/?/; 


metalthio 


Metallothionein 


3.1e-26 


100.6 


167 


LRR 


Leucine Rich Repeat 


0.00069 


26.3 


lOy 


fibrinogen_C 



Fibrinogen beta and gamma chains, 
C-term 


5.3e-180 


611.4 


i7n 
1 /u 


fibrinogen_C


Fibrinogen beta and gamma chains, 
C-term 


5.3e-180 


611.4 


171 


fibrinogen_C


Fibrinogen beta and gamma chains, 
C-term 


1e-149


510.8


17^ 


homeobox


Homeobox domain 


1.5e-29


111.6 


174 


FYVE 


FYVE zinc finger


/.4e-zo 


103.8 


175 


GRIP 


GRIP domain 


3.9e-08 


40.5 


182 


pkinase 


Eukaryotic protein kinase domain 


3.4e-71 


250.0 


185 


CAP_GLY


CAP-Gly domain 


5.6e-51 


182.8 


186 


TBC 


TBC domain


2.2e-50 


180.8 


187 


TBC 


TBC domain


2.2e-50 


180.8 
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PFAMNAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


188 


PDZ 


PDZ domain (Also known as DHR or 
GLGF). 


4e-13 


57.0 


189 


Kelch 


Kelch motif 


5.2e-106 


365.6 


190 


Tropomyosin 


Tropomyosins


3.8e-171 


535.4 


192 


Rieske 


Rieske [2Fe-2S] domain 


0.0016 


18.5 


199 


ig 


Immunoglobulin domain 


5.9e-19 


66.1 


202 


EGF 


EGF-like domain 


3.4e-54 


193.5 


203


trefoil 


Trefoil (P-type) domain 


1e-24


95.5 


204 


TBC 


TBC domain 


8.5e-38 


139.0 


205 


efhand 


EF hand


0.0096 


22.6 


206 


ISK_Channel


Slow voltage-gated potassium 
channel 


0.0031 


8.1 


207 


trefoil 


Trefoil (P-type) domain 


2.9e-48 


173.7 


209 


Ribosomal S13 


Ribosomal protein S13/S18 


1.2e-78 


274.7 


210 


hemopexin 


Hemopexin 


1.3e-62 


221.5 


213 


TBC


TBC domain 


2.5e-48 


174.0 


215 


Basic 


Myogenic Basic domain 


4.3e-50 


179.8 


216 


Ribosomal L24 


KOW motif 


8.2e-23 


89.2 


222 


fn3


Fibronectin type III domain 


7.3e-141 


481.4 


223 


cofilin_ADF 


Cofilin/tropomyosin-type actin- 
binding pr 


9.3e-47 


168.8 


224 


efhand 


EF hand 


6.1e-06 


33.2 


225 


Pterin_4a 


Pterin 4 alpha carbinolamine
dehydratase 


9.3e-42 


152.1 


228 


ABC tran 


ABC transporter 


4.1e-110 


379.2 


234 


E1_DerP2_DerF
2


E1 family


3.7e-90 


312.9 


235 


E1_DerP2_DerF
2


E1 family


1.6e-48


174.6 


237 


PMP22 Claudin 


PMP-22/EMP/MP20/Claudin family


1.7e-25 


98.1 


- 


Opiods_neuropep


Vertebrate endogenous opioids
neuropept


1.8e-159


543.2


239 


eIF-5a 


Eukaryotic initiation factor 5A 
hypusine 


5.9e-104


358.8 


240 


Amino_oxidase 


Flavin containing amine oxidase 


2.5e-11


37.8 


243 


zf-C2H2


Zinc finger, C2H2 type


2.1e-99 


343.6 


244 


Band 7 


SPFH domain / Band 7 family 


2.3e-53 


190.7 


245 


ank 


Ank repeat 


1.6e-88 


307.5 


246 


zf-C2H2 


Zinc finger, C2H2 type 


6.7e-49


175.9 


247 


actin 


Actin 


2.3e-42 


140.3 


248 


ER_lumen_recep 
t 


ER lumen protein retaining receptor 


2.4e-155 


529.5 


250 


PMP22_Claudin 


PMP-22/EMP/MP20/Claudin family 


2.2e-38 


140.9 


252 


Collagen 


Collagen triple helix repeat (20
copies) 


1.4e-13 


58.6 


255 


C2 


C2 domain 


0.052 


7.8 


257 


CAP GLY 


CAP-Gly domain 


1.4e-20 


81.8 


260 


WD40 


WD domain, G-beta repeat 


9.9e-62 


218.5 


261 


WD40 


WD domain, G-beta repeat 


9.9e-62 


218.5


262 


WD40 


WD domain, G-beta repeat 


9.9e-62 


218.5 


263 


cofilin_ADF


Cofilin/tropomyosin-type actin-
binding pr


7.8e-21 


82.6 


264 


Ribosomal_L14


Ribosomal protein L14p/L23e 


9.2e-10 


40.6 


265 


SAPA 


Saposin A-type domain 


4.4e-27 


103.4


266 


SAPA 


Saposin A-type domain 


4.4e-27 


103.4 


267 


ABC tran 


ABC transporter 


9.5e-39 


142.2 


269 


Ribosomal_L14


Ribosomal protein L14p/L23e 


6.2e-62 


219.2 


270 


abhydrolase 


alpha/beta hydrolase fold 


0.042 


-3.3 


272 


ras 


Ras family


4.3e-87 


302.8 
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DESCRIPTION 
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PFAM 
SCORE 


273 




RNA recognition motif. 


0.074 


14.6 


275 


lipocalin


Lipocalin / cytosolic fatty-acid
binding pr


2.5e-41 


146.4 


276 


ras 


Ras family


1.1e-67


238.3 


277 


UCH 


Ubiquitin carboxyl-terminal
hydrolase, famil


1.2e-147 


503.9 


278 


START 


START domain


3.2e-09 


44.1 


279 


WD40 


WD domain, G-beta repeat


1.8e-27 


104.7 


282 


G-patch 


G-patch domain


7.8e-22 


86.0 


287 


Anti_proliferat


BTG1 family


1.2e-101 


351.0 


289 


KRAB 


KRAB box


7.1e-21 


82.8 


293 


7tm_3


7 transmembrane receptor


3.3e-73 


256.6 


295 


SET 


SET domain 


5e-30 


113.2 


296 


Pyridox_oxidase


Pyridoxamine 5'-phosphate oxidase


1.3e-76


268.0 


297 




RNA recognition motif.


5.4e-45 


162.9 


298 


Ubie_methyltran


ubiE/COQ5 methyltransferase family


6.3e-05 


-96.3 


299 


Ubie_methyltran


ubiE/COQ5 methyltransferase family


0.0024 


-118.1 


301 




FAD/NAD-binding Cytochrome
reductase 


7.7e-61 


215.5 


302 




G-patch domain


3.1e-14 


60.7 


307 


7tm_1


7 transmembrane receptor (rhodopsin
family)


7.7e-43 


138.2 


308 


PH 


PH domain


0.0015 


17.8 


310 


7tm_1


7 transmembrane receptor (rhodopsin
family)


1.4e-84


270.8 


311 


Rhodanese


Rhodanese-like domain


3.3e-64 


226.7 


312 


tubulin 


Tubulin/FtsZ family


4.9e-286 


963.6 


314 


SURF4 


SURF4 family


1.2e-199 


676.6 


325 


IMS 


impB/mucB/samB family


2e-58 


207.5 


327 


cadherin 


Cadherin domain 


4.3e-91 


316.0 


329 


NAC 


NAC domain


2.1e-28 


107.8 


330 


IP_trans


Phosphatidylinositol transfer protein


6.5e-98 


338.7 


332 


TFIIS


Transcription factor S-II (TFIIS)


8.8e-05 


29.3 


337 


zf-C2H2 


Zinc finger, C2H2 type


3.6e-61 


216.6


340


AIRS 


AIR synthase related protein


4e-32 


120.2 


343 


annexin 


Annexin


4.6e-80 


279.4 


346 


Stathmin 


Stathmin family


1.8e-90


314.0 


347 


Ribosomal L16 


Ribosomal protein L16 


4.6e-09 


34.9 


348 




Metallo-beta-lactamase superfamily


0.012 


-6.0 


351 


efhand


EF hand


2.5e-14 


61.0 


353 


lectin_c


Lectin C-type domain 


1.3e-05 


32.1 


354 


WD40 


WD domain, G-beta repeat 


2.2e-18 


74.5 


360 


lipocalin


Lipocalin / cytosolic fatty-acid
binding pr


6.3e-10 


38.3 


362 


Acetyltransf


Acetyltransferase (GNAT) family


0.0019 


24.9 


365 


tRNA-synt_1


tRNA synthetases class I (I, L, M and
V)


4.6e-185 


628.2 


366 


Sulfatase 




6.1e-228 


770.6 


368 


START 




3.8e-11


50.5 


369 


pkinase 


Eukaryotic protein kinase domain


2.4e-10 


41.3 


370 


ACBP 


Acyl CoA binding protein


4.4e-56 


199.7 


371 


pkinase 


Eukaryotic protein kinase domain


1.6e-94 


327.5 


373 


EGF 


EGF-like domain


2.6e-12 


54.3 


375 


zf-C2H2 


Zinc finger, C2H2 type


8.2e-64 




377 


KRAB 


KRAB box


3.7e-27 


103.7 


379 


SET 


SET domain 


7.3e-61


215.6 


380


Glyco_transf_8


Glycosyl transferase family 8


0.0028


-40.1


381


zf-C2H2


Zinc finger, C2H2 type


1.3e-06


53.7 


383


Glyco_transf_8


Glycosyl transferase family 8


0.0028


-40.1






SEQID 
NO: 


PFAM NAME


DESCRIPTION 
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384 


RasGEF 


RasGEF domain 


8.1e-43 


155.7 


385 


TBC 


TBC domain 


0.017 


-66.6 


389 


Glycos_transf_2 


Glycosyl transferases 


1.3e-15 


65.3 


390 


Na Ca Ex 


Sodium/calcium exchanger protein


3.9e-105 


362.7 


391 


fn3


Fibronectin type III domain 


4.1e-102 


352.6


392 


fn3


Fibronectin type III domain 


3.4e-45 


163.6 


393 


fn3


Fibronectin type III domain


3.4e-45 


163.6 


394 


ldl_recept_b 


Low-density lipoprotein receptor 
repeat 


7.1e-49 


175.8 


395 


Ribosomal L30 


Ribosomal protein L30p/L7e 


0.0023


16.0 


396 


Oxysterol_BP


Oxysterol-binding protein 


1.5e-94 


327.5 


397 


RDS_ROM1


Peripherin/rom-1


2.9e-33 


123.9 


399 


lactamase_B


Metallo-beta-lactamase superfamily


3.4e-39 


143.6 


402 


F-box 


F-box domain. 


0.0002 


28.1 


403 


CLP_protease 


Clp protease 


4.8e-64 


226.2 


405 


Ribosomal L35 
Ae 


Ribosomal protein L35Ae 


6e-77 


269.0 


406 


LIM 


LIM domain containing proteins 


0.00021


20.7 


410 


tRNA-synt_1c


tRNA synthetases class I (E and Q) 


1e-236


799.8 


411 


NTP transf 2 


Nucleotidyltransferase domain 


3.9e-16


67.0 


412 


DEAD 


DEAD/DEAH box helicase 


0.00016 


17.2


414 


DUF94 


Domain of unknown function DUF94 


0.00011 


26.9


415 


tubulin 


Tubulin/FtsZ family 


4.5e-289 


973.7 


420 


SET 


SET domain 


3.3e-57 


203.5 


421 


WD40 


WD domain, G-beta repeat 


6.1e-29 


109.6 


423 


zf-C2H2 


Zinc finger, C2H2 type 


1.5e-39


144.9 


424 


pkinase 


Eukaryotic protein kinase domain 


8.9e-75 


261.8 


428 


LIM 


LIM domain containing proteins 


1.8e-34 


126.7 


431 


kazal 


Kazal-type serine protease inhibitor 
domain 


3.7e-18 


73.8 


432 


SH2


Src homology domain 2


1.4e-67


198.4 


433 


zf-C2H2 


Zinc finger, C2H2 type


2.8e-144 


492.7 


434 


ras 


Ras family 


0.012


-106.8 


436 


E1-E2_ATPase


E1-E2 ATPase 


1.6e-117 


391.0 


437 


RNA_pol A 


RNA polymerase alpha subunit


0 


1077.7 


438 


PHD 


PHD-finger 


1.6e-11


51.7


439 


lectin_c 


Lectin C-type domain 


4.7e-30 


113.3 


440 


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-65


231.6 


441 


arrestin 


Arrestin (or S-antigen) 


2.9e-254 


858.1 


442 


aminotran_3 


Aminotransferases class-III 
pyridoxal-pho 


8.2e-80


231.1 


443 


UCH-1 


Ubiquitin carboxyl-terminal 
hydrolases famil


8.5e-12 


52.6 


444 


CTF_NFI 


CTF/NF-I family 


2.6e-277


934.6 


451 


T-box 


T-box 


3.8e-117 


402.6 


453 


Rieske 


Rieske [2Fe-2S] domain 


2.6e-13 


57.7 


454 


zf-C2H2 


Zinc finger, C2H2 type 


3.9e-64 


226.5 


456 


homeobox 


Homeobox domain


2.8e-08 


38.9 


459 


ig 


Immunoglobulin domain 


2.6e-20 


70.5 


460 


Hydrolase 


haloacid dehalogenase-like hydrolase 


4e-25 


96.9 


462 


rve 


Integrase core domain 


1.6e-13 


50.7 


466 


CH 


Calponin homology (CH) domain 


2.4e-17 


71.1 


467 


CH 


Calponin homology (CH) domain 


2.4e-17 


71.1 


468 


Sterol_desat 


Sterol desaturase 


7.5e-38 


139.2 


469 


pro_isomerase 


Cyclophilin type peptidyl-prolyl cis- 
tr 


2.6e-63 


220.9 


470 


Peptidase M24 


metallopeptidase family M24


6e-08 


28.1 




PDZ 


PDZ domain (Also known as DHR or 
GLGF). 


5.4e-129 


441.9 
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472 


binding


Myb-like DNA-binding domain


3.6e-06 


33.9 


473 


ZZ


Zinc finger present in dystrophin, CB


0.012 


20.0 


474 


EF1G_domain


Elongation factor 1 gamma,
conserved doma


6.3e-88 


305.5 


475 


Ribosomal L31p 


Ribosomal protein L31e


6.1e-66 


232.5 


476 


C1q


C1q domain


2.5e-75 


263.7 


477


SID 


OXX7 Qomain 


1.1e-12


55.6 


478 


MoaA_NifB_Pq
qE


moaA / nifB / pqqE family


0.002 


-17.7 


479 


FYVE 


FYVE zinc finger


9.3e-21 


78.6 


480 


DNA_pol_A


DNA polymerase family A


2.3e-46 


167.4 


482 


adh_short


short chain dehydrogenase


1.2e-62 


221.6 


483 


ank 


Ank repeat


1.3e-17 


71.9 


484 


IMS 


impB/mucB/samB family


2.2e-83 


290.5 


486 


TIR 


TIR domain


3.2e-19 


67.8 


487 


FMO-like 


Flavin-binding monooxygenase-like


0 


1425.5 


488 


I_LWEQ 


I/LWEQ domain 


9.5e-101


341.0 


495 




Homeobox domain 




30.8 


497 


pkinase 


Eukaryotic protein kinase domain 


2.3e-166




499 


fn3


Fibronectin type III domain


2.5e-237 


801.8 


501 


LRR 


Leucine Rich Repeat 


9.3e-31


115.6 


502 


RGS


Regulator of G protein signaling
domain


0.041 


11.9 


503 
505 


filament


Intermediate filament proteins 
Fibronectin type III domain 


le-142 


487.5 


506 


HECT


HECT-domain (ubiquitin-
transferase). 


1.3e-100 
le-13 


347.7 
59.0 


507 


Ribosomal_L7Ae


Ribosomal protein L7Ae


5.7e-26 


99.7 


508 




WD domain, G-beta repeat


0.063 


19.8 


509


WD40


WD domain, G-beta repeat 


0.063 


19.8 


510 


WD40 


WD domain, G-beta repeat


2.1e-42


154.3 


511 


pkinase


Eukaryotic protein kinase domain


2.3e-86 


300.4 


512 




GGL domain 


1.9e-08 


34.3 


513 


SH3 


SH3 domain


3e-06 


34.2 


515 


HTH_AraC


Bacterial regulatory helix-turn-helix
protei 


3.9e-27 


103.6 


516 


zf-C2H2 


Zinc finger, C2H2 type 


1.7e-34


128.0 


517 


S1


S1 RNA binding domain


6.1e-58 


205.9 


518 


pkinase 


Eukaryotic protein kinase domain


1.8e-75 


264.2 


525 


cadherin 


Cadherin domain


2e-80 


280.6 


528 
529 


zf-C2H2 
neur chan 


Zinc finger, C2H2 type


4e-70 


246.4 


531 


RhoGEF 


Neurotransmitter-gated ion-channel
RhoGEF domain


5.8e-222 
3.5e-44 


750.8 
160.2 


532 


myosin_head


Myosin head (motor domain)


0 


1494.5 


533 


LRR 


Leucine Rich Repeat


8.3e-15 


62.6 


535 


Sec7 


Sec7 domain 


5.1e-92 


319.1 


536 


homeobox 


Homeobox domain


4.8e-05


26.4 


539 


actin 


Actin


2.4e-100 


330.6 


542 


ank 


Ank repeat 


1.9e-35 


131.2 


544 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H type


2.8e-10 


41.7 


546 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


2.4e-40 


147.4 


547

549

551

552


HMG_CoA_synt

laminin_G
PHD
PDZ


Hydroxymethylglutaryl-coenzyme A
synthas

Laminin G domain

PHD-finger
PDZ domain (Also known as DHR or


0

5.3e-76
0.008
0.0017


1250.8

166.6

15.0
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PFAMNAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 






GLGF). 






555 


WW 


WW domain 


1.3e-24


95.3 


558 


kinesin 


Kinesin motor domain 


1.8e-176


599.7 


559 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


0.00085 


16.5 


563 


efhand 


EF hand


7.9e-11


49.4 


567 


PH 


PH domain 


7.8e-06 


25.9 


568 


PH 


PH domain 


3.1e-39 


143.8 


569 


Hist_deacetyl


Histone deacetylase family 


5.2e-106 


365.6 


570 


PDZ 


PDZ domain (Also known as DHR or 
GLGF). 


3.4e-20 


80.5 


571 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


1e-16


58.5 


573 


ubiquitin 


Ubiquitin family 


1.4e-08 


31.1 


574 


FH2 


Formin Homology 2 Domain 


1.3e-110 


380.9 


576 


serpin 


Serpins (serine protease inhibitors)


4.3e-146 


496.4 


579 


zf-C2H2 


Zinc finger, C2H2 type 


5.7e-76


265.8 


580 


pkinase 


Eukaryotic protein kinase domain 


6.9e-79


275.5 


581 


RhoGAP 


RhoGAP domain 


4.4e-53 


189.8 


582 


Ribosomal_L7A
e 


Ribosomal protein L7Ae 


0.028 


1.0 


584 


kazal 


Kazal-type serine protease inhibitor
domain 


2.2e-52 


187.4 


585 


LRR 


Leucine Rich Repeat 


4.4e-28 


106.7 


586 


PHD 


PHD-finger 


3.8e-12 


53.8 


588 


GTP1_OBG


GTP1/OBG family


1.1e-62


215.2 


590 


Collagen 


Collagen triple helix repeat (20 
copies) 


8e-42 


152.4 


591 


lys 


C-type lysozyme/alpha-lactalbumin
family 


1.6e-31 


116.4


596




Acyl CoA binding protein 


0.0022


-9.4 


597


SNF2_N 


SNF2 and others N-terminal domain 


3.7e-98 


339.5 


600 


KRAB 


KRAB box 


1.3e-29 


111.8 


606 


LRR 


Leucine Rich Repeat 


1e-05


32.5 


607 


LRR 


Leucine Rich Repeat 


1e-05


32.5 


608 


WD40 


WD domain, G-beta repeat 


5.3e-23


89.8 


610 


cpn60_TCP1


TCP-1/cpn60 chaperonin family


1.7e-237


802.4


613 


THF_DHG_CY
H 


Tetrahydrofolate 
dehydrogenase/cyclohydro 


4.9e-173 


588.3 


617 


rrm


RNA recognition motif. 


4e-14 


60.4 


618 


rrm


RNA recognition motif. 


4e-14 


60.4 


620 


cofilin_ADF


Cofilin/tropomyosin-type actin- 
binding pr 


3e-06 


34.2 


621 


Nop 


Putative snoRNA binding domain


6.1e-95 


328.8 


622 


UCH-2 


Ubiquitin carboxyl-terminal 
hydrolase family 


5.8e-21 


83.1 


625 


zf-C2H2 


Zinc finger, C2H2 type 


2.5e-124 


426.4 


628 


DEAD 


DEAD/DEAH box helicase 


2.5e-68 


219.0 


632 


GST 


Glutathione S-transferases.


4.8e-26 


89.0 


633 


5_nucleotidase


5'-nucleotidase 


6.6e-248 


837.0 


636 


LIM 


LIM domain containing proteins 


1.6e-88 


307.5 


637 


pkinase 


Eukaryotic protein kinase domain 


1.5e-73 


257.8 


638 


MSP domain 


MSP (Major sperm protein) domain 


8.4e-09 


42.7 


639 


metalthio


Metallothionein 


2e-24 


94.6 


641 


zf-C2H2 


Zinc finger, C2H2 type 


6.1e-114 


391.9 


642 


Ribosomal S28e 


Ribosomal protein S28e 


9.3e-48


172.1 


643 


Ribosomal_S5


Ribosomal protein S5 


8.3e-87 


301.8 


646 


PHD 


PHD-finger 


0.00025 


23.1 


647 


WD40


WD domain, G-beta repeat


1.5e-22


88.4
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i^iHscKurrioN 


p-value


PFAM 
SCORE 


648 


Lipase_GDSL


like motif 


U.Ui J 


2.2 


652 


zf-C2H2


Zinc finger, C2H2 type


4.le-14o 


498.8 


653 


histone 


Core histone H2A/H2B/H3/H4


l.2e-i0 


48.8 


654 


zf-C2H2 


Zinc finger, C2H2 type


i.ye-o/ 


303.9 


655 


ras 


Ras family


o.4e- / / 


269.0 


657 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)




46.4 


658 


STphosphatase 






619.1 


659 


zf-C2H2 


Zinc finger, C2H2 type 


1.3e-92 


321.1 


660 


zf-C2H2


Zinc finger, C2H2 type


1.5e-85 


297.6 


662 


NDK 


Nucleoside diphosphate kinases


1.4e-119 


410.7 


664 


IRF 


Interferon regulatory factor
transcription f


7e-20 


79.5


665


4HPPD_C 


4-hydroxyphenylpyruvate


1.4e-jo 


68.5 


666 


DEAD 


DEAD/DEAH box helicase


4.5e-74


237.1 


667 


DEAD 


DEAD/DEAH box helicase


2.9e-70 


225.1 


669 


pkinase 


Eukaryotic protein kinase domain


u.le-yj 


322.2 


671 


homeobox 


Homeobox domain 


0.018 


16.5 


678 


crystall


Beta/Gamma crystallin


4.7e-106 


365.8 


679 


WD40


WD domain, G-beta repeat


1.9e-06


34.9 


680 


Keratin_B2


Keratin, high sulfur B2 protein 


4.1e-06 


15.9 


682 




vjoLf aoniain 


8.5e-33 


117.9 


685 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


1.4e-29 


111.7 


686 


Acetyltransf


Acetyltransferase (GNAT) family


6.6e-10


46.4 


687 


7tm_1


7 transmembrane receptor (rhodopsin
family)


4.6e-15 


50.0 


688 


proteasome 


Proteasome A-type and B-type


C CA 


225.7 


689 


SCP2 


SCP-2 sterol transfer family 


6.2e-37 


136.1 


690 


TS-N


TS-N domain


0.041 


20.1 


692 


zf-C2H2


Zinc finger, C2H2 type


9.9e-60 


211.9 


693 


zf-MYND 


MYND finger


0.038 


5.5 


694 


Oxysterol_BP 


Oxysterol-binding protein 


3.9e-133 


455.7 


695 


PDZ 


PDZ domain (Also known as DHR or


L3e-30 


115.1 


703 


Peptidase_C2


Calpain family cysteine protease


2.3e-175 


596.0 


706 


filament 


Intermediate filament proteins


7.2e-107 


368.5 


710 


fibrinogen_C


Fibrinogen beta and gamma chains,
C-term


7e-80 


278.0 


711 


SH2 


Src homology domain 2


2.3e-65 


192.1 


712 


ATP-synt_DE


ATP synthase Delta/Epsilon chain




19.0 


713 


ARID 


ARID DNA binding domain


2e-17 


71.3 


714 


LBP_BPI_CETP


LBP / BPI / CETP family


o.oe-i4 


125.7 


715 


RNA_pol_L


RNA polymerases L / 13 to 16 Kd
subunit 


4.oe-4y 


176.3 


716 


KRAB 


KRAB box 


1 2a 

i.oe-42 


155.0 


717 


mito_carr


Mitochondrial carrier proteins 


4.8e-38 


133.3 


719 


Gal-bind_lectin




1 .De-2j 


90.2 


726 


aldedh 


Aldehyde dehydrogenase family 


1.3e-119 


410.8 


728 


Glycos_transf_2


Glycosyl transferases


4e-21 


83.6 


734 


ELM2 


ELM2 domain


2e-34 


127.8 


735 


PR55 


Protein phosphatase 2A regulatory
subunit PR 


0 


1038.9


737 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


4e-14


60.4 


740 


WD40 


WD domain, G-beta repeat 


5.6e-14


59.9 


745 


zf-C3HC4


Zinc finger, C3HC4 type (RING 


3.8e-13 


^^6.9 
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PFAM 
SCORE 






finger) 






749 


mito carr 


Mitochondrial carrier proteins 


4.5e-67 


232.8 


750 


DUF27 


Domain of unknown function DUF27


4.5e-12 


53.5 


751 


SH3 


SH3 domain 


3.6e-17 


70.5 


752 


HMG_box


HMG (high mobility group) box 


8.6e-13 


55.9 


753 


SPRY 


SPRY domain 


5.9e-05 


23.3 


754 


GTP_CDC


Cell division protein 


7.5e-153


521.2 


755 


mito carr 


Mitochondrial carrier proteins 


3e-88 


305.4 


756 


TSPN 


Thrombospondin N-terminal -like
domains 


8.1e-58 


205.5 


757 


BTB 


BTB/POZ domain 


5.7e-23 


89.7 


759 


zf-C2H2


Zinc finger, C2H2 type 


1.2e-12 


55.4


760 


NSF 


NSF attachment protein 


6.4e-127 


435.1 


762 


Ribosomal S14 


Ribosomal protein S14p/S29e 


2.1e-06 


24.8


765 


ThiF_family


ThiF family


1.7e-39 


144.6 


766 


DnaJ 


DnaJ domain 


3.9e-36 


133.5 


768 


tRNA-synt_2b


tRNA synthetase class II


9.1e-81 


281.7 


769 


ldl_recept_a


Low-density lipoprotein receptor 
domain 


0 


1404.5 


770 


WD40 


WD domain, G-beta repeat 


2e-21 


84.6 


771 


LRR 


Leucine Rich Repeat 


3.8e-06 


33.9 


774 


SNF2 N 


SNF2 and others N-terminal domain


5.5e-99


342.3 


776 


VPS9 


Vacuolar sorting protein 9 (VPS9) 
domain 


1.1e-30


115.4 


777 


VPS9 


Vacuolar sorting protein 9 (VPS9) 

domain 


1.1e-30


115.4


778 


VPS9 


Vacuolar sorting protein 9 (VPS9)
domain


1.1e-30


115.4 


779 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


3.1e-08 


31.0 


781 


cadherin


Cadherin domain


5.6e-113


388.7


783 


HECT


HECT-domain (ubiquitin-
transferase). 


4:2e-3r 


116.8


785 


sushi 


Sushi domain (SCR repeat) 


1.8e-60 


214.3 


786 


sushi 


Sushi domain (SCR repeat) 


1.8e-60 


214.3 


788 


vwa 


von Willebrand factor type A domain 


1.9e-52 


187.7 


790 


rrm 


RNA recognition motif. 


2.8e-20 


80.8 


791 


Collagen 


Collagen triple helix repeat (20 
copies) 


0.00097 


9.7 


792 


pkinase 


Eukaryotic protein kinase domain 


0.023 


12.4 


795 


zf-C2H2 


Zinc finger, C2H2 type 


6.5e-95 


328.7 


796 


adh short 


short chain dehydrogenase 


4.1e-05 


-7.3 


799 


SAICAR_synt


SAICAR synthetase 


6e-125 


428.5 


805 


WD40 


WD domain, G-beta repeat 


4e-65 


229.8 


806 


ZU5 


ZU5 domain 


4.7e-37 


136.5 


807 


WD40 


WD domain, G-beta repeat 


0.016 


21.8 


808 


WD40 


WD domain, G-beta repeat 


0.0041 


23.8 


809 


pkinase 


Eukaryotic protein kinase domain 


2e-31 


117.2 


810 


vwa 


von Willebrand factor type A domain 


1.9e-52


187.7 


814 


zf-C2H2 


Zinc finger, C2H2 type


4.5e-83 


289.4 


815 


zf-C2H2


Zinc finger, C2H2 type 


6e-74 


259.1 


817 


myosin_head


Myosin head (motor domain) 


1.5e-176 


599.9 


818 


GSPII_E


Bacterial type II secretion system 
protein 


0.012 


11.5 


819 


PDEase 


3'5'-cyclic nucleotide
phosphodiesterase 


1.1e-74


215.5 


821 


PH 


PH domain 


0.00025 


20.5 


822


CNH 


CNH domain 


0.00015 


-24.7 


827 


rrm


RNA recognition motif. 


1.5e-06 


35.2 
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SEQID 

NO: 

829 



830 



831 



833 



837 



838 



840 
842 



843 



845 



848 



850 

852 



853 
854 



856 



858 



860 



866 



868 



871



877 
882 



885 
886 



887 



888 



889 



890 



892 
893 



894 



895 
W 



898 



900 



901 
903 



PFAMNAME 



HMG box 



RasGEF 



DESCRIPTION 



CNH 



mito carr 



PX 



Y_phosphatase



ank



ank 



Ribosomal L15e 



SNF 



Peptidase_M16 



HMG (high mobility group) box 



RasGEF domain 



CNH domain 



Mitochondrial carrier proteins 



PX domain 



Protein-tyrosine phosphatase 



Ank repeat 



Ank repeat 



Ribosomal L15



Sodium:neurotransmitter symporter
family 



EF1BD



zf-C2H2



zf-C2H2 



SIS 



RhoGAP 
PDZ 



ACOX 



efhand 
homeobox 



TFIIF beta 



A2M

MoCF_biosynth



Insulinase (Peptidase family M16) 



EF-1 guanine nucleotide exchange 
domain 



p-value 



7.8e-34 



2.2e-102 



3e-118



3.7e-37 



2.7e-19 



1.6e-263 



2.4e-270 



5.8e-38 



4.8e-131 



4.7e-67 



2.2e-56



Zinc finger, C2H2 type
Zinc finger, C2H2 type



SIS domain



RhoGAP domain 



PDZ domain (Also known as DHR or 

GLGF). 



Acyl-CoA oxidase 



EF hand



Homeobox domain 



Transcription initiation factor IIF,
beta 



1.5e-122 



2e-67 



3.8e-30 



1.1e-37



5.1e-10



9.1e-263 



2.4e-18 



4e-22 



EGF 



Alpha-2-macroglobulin family



EGF 



PI-PLC-X



UCH-2 



SH3 



SH3 



KRAB 



Molybdenum cofactor biosynthesis 

protei 

EGF-like domain



EGF-like domain 



Phosphatidylinositol-specific 
phospholipase. 



Ubiquitin carboxyl-terminal 
hydrolase family 



SH3 domain 



SH3 domain 



ank 

biopterin_H 



KRAB box 



Ank repeat 



GTP EFTU 



zf-C3HC4



zf-C2H2 



PTR2 
Sulfatase 



Sulfatase 
7tm 1 



Glyco_hydro 31 
chromo 



Cbl N 



vwa 



WD40 



zf-C2H2 



ras 



Biopterin-dependent aromatic amino 
acidh 



Elongation factor Tu family 



Zinc finger, C3HC4 type (RING
finger) 



Zinc finger, C2H2 type 



Immunoglobulin domain 



POT family 



2.2e-134 



4.9e-21 



5.8e-205 



4.1e-22 



1.1e-22



7.2e-95 



1.1e-20



2.2e-14 



8.6e-90 



6.9e-45 



7.1e-07 



4.9e-129 



1.6e-14



3.7e-92 



3.8e-06 



Sulfatase 



Sulfatase 



7 transmembrane receptor (rhodopsin 
family) 



9.5e-48 



3.5e-78



3.5e-78 



4.5e-51 



Glycosyl hydrolases family 31 



'chromo' (CHRromatin Organization
Modifier)



CBL proto-oncogene N-terminal
domain 



von Willebrand factor type A domain 



WD domain, G-beta repeat 



Zinc finger, C2H2 type 
Ras family 



3.9e-06 



1.2e-273



5.5e-32 



2.7e-07 



4e-156 
6.6e-101 



PFAM 
SCORE 



125.8 



353.5 



406.2 



130.3 



77.5 
888.8 



911.5 



139.6 



448.8 



1201.8 



236.2 



200.7 



420.5 



237.4 



113.6 



138.6 



46.7 



886.3 



74.4 



459,S 



70.9 



694.3 



86.9 



88.8 



328.6 



82.1 



61.2 



311.7 



162.6 



36.3 



988.3 



437.5 



51.4 



319.6 



24.8 



163.0 



273.2 
273.2 



164.4 



1277J 



26.0 



922.4 



119.7 



37.7 



532.1 
348.6 
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SEQID 
NO: 


PFAM NAME 


DESCRIPTION 


p-value


PFAM 
SCORE 


904 


Armadillo_seg


Armadillo/beta-catenin-like repeats 


1.1e-06


35.6 


906 


FH2 


Formin Homology 2 Domain 


4.5e-112 


385.7 


907 


Cytidylyltransf 


Cytidylyltransferase 


1.4e-05 


29.3 


908 


pkinase 


Eukaryotic protein kinase domain 


1.2e-64 


228.2 


909 


pkinase 


Eukaryotic protein kinase domain 


8.5e-70 


245.3 


910 


pkinase 


Eukaryotic protein kinase domain 


2.9e-42 


153.8 


911 


pkinase 


Eukaryotic protein kinase domain 


1.2e-35 


131.8 


912 


PHD 


PHD-finger 


5.1e-06 


33.4 


913 


PHD 


PHD-finger 


5.5e-16 


66.5 


916 


filament 


Intermediate filament proteins 


9.7e-121 


414.5


917 


LIM 


LIM domain containing proteins 


5.9e-15


57.9 


918 


SAM 


SAM domain (Sterile alpha motif) 


4.3e-16 


66.9


922 


Acylphosphatase 


Acylphosphatase 


2.9e-63 


223.6 


924 


ig 


Immunoglobulin domain 


1.3e-08 


32.8 


925 


Acyl-CoA_dh 


Acyl-CoA dehydrogenase 


2.4e-131 


449.8 


927 


7tm_l 


7 transmembrane receptor (rhodopsin
family) 


2.9e-45 


145.9 


928 


globin 


Globin 


2.4e-52 


186.9 


929 


sugar_tr


Sugar (and other) transporter 


1.2e-16 


68.8 


932 


Collagen 


Collagen triple helix repeat (20 
copies) 


0.00097 


9.7 


933 


HMG box 


HMG (high mobility group) box 


7.8e-34 


125.8 


934 


SEA 


SEA domain 


0.0021 


24.7 


935 


ras 


Ras family 


6.4e-59 


209.2 


936 


CH 


Calponin homology (CH) domain 


3.8e-21 


83.7 


937 


voltage_CLC


Voltage gated chloride channels 


1.9e-199 


676.0 


938 


homeobox 


Homeobox domain 


1.9e-25


98.0 


940 


pkinase 


Eukaryotic protein kinase domain 


9.9e-58 


205.2 


942 


Myosin_tail


Myosin tail 


3.7e-09 


38.2 


943 


zf-C2H2 


Zinc finger, C2H2 type 


2.2e-92 


320.3 


945


Clat_adaptor_s


Clathrin adaptor, complex small chain


1.3e-76


268.0


946


sugar_tr


Sugar (and other) transporter


0.017


-122.8


947 


tRNA-synt_le 


tRNA synthetases class I (C) 


0.00097 


15.6 


948 


PHD 


PHD-finger


2.2e-17 


71.2 


951 


sugar_tr 


Sugar (and other) transporter 


0.0082 


-113.9


952 


mito_carr 


Mitochondrial carrier proteins 


1.7e-54 


189.7 


953 


myb_DNA-
binding


Myb-like DNA-binding domain 


4.5e-20 


80.1 


955 


ketoacyl-synt 


Beta-ketoacyl synthase 


7.1e-133 


454.8 


957 


aldo_ket_red


Aldo/keto reductase family


1.5e-98


340.8 


959 


Kelch


Kelch motif 


0.02 


20.8 


961 


ras 


Ras family 


2.2e-29 


111.1 


964 


homeobox 


Homeobox domain


5.4e-22 


86.5 


965 


PH 


PH domain 


3e-21 


80.9 


966 


zf-C3HC4 


Zinc finger, C3HC4 type (RING finger)


2.2e-09


34.7 


967 


Ribosomal_L29


Ribosomal L29 protein


1.6e-15


65.0 


970 


FAD_binding_2


FAD binding domain 


8.9e-47 


166.6 


971 


rve 


Integrase core domain 


0.00015 


19.8 


972 


Glycos_transf_2 


Glycosyl transferases 


2.1e-21 


84.5 


974 


Ribosomal_L10


Ribosomal protein L10


3.3e-48 


173.6 


975 


7tm_1


7 transmembrane receptor (rhodopsin family)


1.6e-37 


121.3 


976 


zf-C4 


Zinc finger, C4 type (two domains) 


2.1e-52 


178.5 


977 


zf-C2H2 


Zinc finger, C2H2 type


6.6e-150 


511.4 


978 


FTHFS 


Formate--tetrahydrofolate ligase


0 


1367.2 


982 


Renal_dipeptase 


Renal dipeptidase 


1.3e-73


258.0 


984 


A_deaminase


Adenosine/AMP deaminase 


2.6e-05 


-48.6 
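As a reading aid only, the rows of the preceding table can be treated as records and filtered by expectation value. The sketch below assumes the five columns are, in order, SEQ ID NO:, domain-model name, description, E-value and score (the column header falls outside this portion of the text), and the cutoff of 1e-5 is a hypothetical value chosen for illustration, not a threshold stated in this document. The names DomainHit and confident_hits are likewise illustrative.

```python
from typing import List, NamedTuple


class DomainHit(NamedTuple):
    seq_id: int        # SEQ ID NO: of the query sequence (assumed column meaning)
    model: str         # short domain-model name
    description: str   # human-readable description of the model
    e_value: float     # expectation value; smaller means less likely by chance
    score: float       # score reported for the hit


# A few rows copied from the table above.
HITS = [
    DomainHit(922, "Acylphosphatase", "Acylphosphatase", 2.9e-63, 223.6),
    DomainHit(932, "Collagen", "Collagen triple helix repeat (20 copies)", 0.00097, 9.7),
    DomainHit(959, "Kelch", "Kelch motif", 0.02, 20.8),
    DomainHit(978, "FTHFS", "Formate--tetrahydrofolate ligase", 0.0, 1367.2),
]


def confident_hits(hits: List[DomainHit], max_e_value: float = 1e-5) -> List[DomainHit]:
    """Return hits at or below the (hypothetical) E-value cutoff, best first."""
    kept = [h for h in hits if h.e_value <= max_e_value]
    return sorted(kept, key=lambda h: h.e_value)


if __name__ == "__main__":
    for hit in confident_hits(HITS):
        print(hit.seq_id, hit.model, hit.e_value)
```

With the sample rows, only the entries for SEQ ID NO: 978 and 922 pass the assumed cutoff; the weaker Collagen and Kelch matches are dropped.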
TABLE 5



SEQ ID NO: 
of full-length 
nucleotide 
sequence 



1 
2 
3 
4 
5 
6 
7 
8 
9 
10 



SEQ ID 
NO: of 
full-length
peptide 
sequence 



985 



986 



987 



SEQ ID NO: 
of contig 
nucleotide 
sequence 



1969 



1970 



1971 



SEQ ID NO: 
of contig 
peptide 
sequence 



2953 



2954 



2955 



Priority docket
number_corresponding
SEQ ID NO: in
priority application



787CIP2 1 



787CIP2 2 



787CIP2 3 



SEQ ID NO: in 
U.S.S.N. 09/496,914



150 



223 



1884 



988 



989 



990 



991 



992 
993 



1972 



2956 



787CIP2 4 



1973 



2957 



1974 



787CIP2_5



1975 
1976 



2958 
2959 



787CIP2 6 



2960 



1977 



787CIP2_7 
787CIP2 8 



2961 



787CIP2 9 



2123 



2313 



3284 



3324 



6182 



6210 



994 



1978 



2962 



787CIP2 10 



6213 



995 



1979 



2963 



787CIP2 11 



6257 



12 
13 



996 



1980 



2964 



787CIP2 12 



6294 



14 
15 
16 



997 



998 



999 



1981 



2965 



1982 



787CIP2 13 



2966 



1983 



787CIP2 14 



2967 



787CIP2 15 



6294 



6330 



6364 



17 



18 



19 



20 
21 



22 



1000 



1001 



1002 



1003 



1004 

1005 



1006 



1984 



2968 



1985 



787CIP2 16 



2969 



1986 



787CIP2 17 



2970 



1987 



787CIP2 18 



2971 



1988 



787CIP2 19 



2972 



1989 



787CIP2 20 



2973 



1990 



787CIP2_21



2974 



787CIP2 22 



6455 



6486 



6503 



6528 
6572 



6578 



6593 



1007 



1991 



2975 



787CIP2 23 



6603 



25 
26
27 



1008 



1009 



1010 



1011 



1992 



2976 



1993 



787CIP2 24 



2977 



1994 



787CIP2 25 



2978 



1995 



787CIP2 26 



2979 



787CIP2 27 



6603 



6679 



6744 



6762 



1012 



1996 



2980 



787CIP2 28 



6770 



30 



31 



32 



33 



34 
35 
36 



37 



1013 



1014 



1015 



1016 



1017 



1018 



1019 



1020 



1021 
1022 



1997 



2981 



1998 



787CIP2 29 



2982 



1999 



787CIP2 30 



2983 



2000 



787CIP2 31 



2001 



2002 



2003 



2004 



2005 



2984 
2985 



787CIP2 32 



787CIP2 33 



2986 



787CIP2 34 



2987 



787CIP2 35 



2988 



787CIP2 36 



2989 
2990 



787CIP2 37 



6770 



6787 



6858 



6866 



6938 



6938 



6977 



7001 



7002 



39 



40 
41 



42 



1023 



1024 



1025 



1026 



2006 



2007 



787CIP2 38 



2991 



2008 



787CIP2 39 



2992 



2009 



787CIP2 40 



2993 



2010 



787CIP2 41 



2994 



787CIP2 42 



7004 



7005 



7006 



7008 



7014 



44 



1027 



1028 
1029 



2011 
2012 



2995 



787CIP2 43 



2996 



787CIP2_44



7021 



7022 



46 



47 



1030 



1031 



2013 



2997 



2014 



2998 



2015
2016 



2999 



787CIP2 46 



787CIP2_47



787CIP2 49 



7057 



7058 



7088 



48 
49 



1032



3000 



1033 



2017 



787CIP2 50 



3001 



787CIP2 51 



7089 



7182 



51 



52 
53 



1034 



2018 



1035 



2019 



1036 



2020 



1037 



3002 
3003 
3004 



787CIP2 52 



787CIP2 53 



2021 



787CIP2 54 



3005 



787CIP2 55 



7489 



7564 



7566 



7587 
54


1038 


2022 


3006 


787CIP2 56 


7591 


55 


1039 


2023 


3007 


787CIP2 57 


7600 


56 


1040 


2024 


3008 


787CIP2 58 


7604 


57 


1041 


2025 


3009 


787CIP2 59 


7612 


58 


1042 


2026 


3010 


787CIP2 60 


7613 


59 


1043 


2027 


3011 


787CIP2_61 


7615 


60 


1044 


2028 


3012 


787CIP2 62 


7616 


61 


1045 


2029 


3013 


787CIP2 63 


7617 


62 


1046 


2030 


3014 


787CIP2_64


7623 


63 


1047 


2031 


3015 


787CIP2 65 


7625 


64 


1048 


2032 


3016 


787CIP2_66


7625 


65 


1049 


2033 


3017 


787CIP2 67 


7630 


66 


1050 


2034 


3018 


787CIP2 68 


7638 


67 


1051 


2035 


3019 


787CIP2_69


7640 


68 


1052 


2036 


3020 


787CIP2 70 


7670 


69 


1053 


2037 


3021 


787CIP2 71 


7676 


70 


1054 


2038 


3022 


787CIP2 72 


7688 


71 


1055 


2039 


3023 


787CIP2_73 


7690 


72 


1056 


2040 


3024 


787CIP2 74 


7700 


73 


1057 


2041 


3025 


787CIP2_75 


7774 


74 


1058 


2042 


3026 


787CIP2_76


7784 


75 


1059 


2043 


3027 


787CIP2 77 


7785 


76 


1060 


2044 


3028 


787CIP2 78 


7792 


77 


1061 


2045 


3029 


787CIP2_79


7798 


78 


1062 


2046 


3030 


787CIP2 80 


7807 


79 


1063 


2047 


3031 


787CIP2 81 


7810 


80 


1064 


2048 


3032 


787CIP2 82 


7812 


81


1065 


2049 


3033 


787CIP2 83 


7816 


82 


1066 


2050 


3034 


787CIP2_84


7826 


83 


1067 


2051


3035 


787CIP2 85 


7842 


84 


1068 


2052 


3036 


787CIP2_86 


7850 


85


1069


2053


3037


787CIP2_87


7865


86


1070


2054


3038


787CIP2_88


7882 


87 


1071 


2055 


3039 


787CIP2_89 


7891 


88 


1072


2056 


3040 


787CIP2 90 


7892 


89 


1073 


2057 


3041 


787CIP2_91


7896 


90 


1074 


2058 


3042 


787CIP2_92 


7896 


91 


1075 


2059 


3043 


787CIP2 93 


7907 


92 


1076 


2060 


3044 


787CIP2 94 


7913 


93 


1077 


2061 


3045 


787CIP2 95 


7914 


94 


1078 


2062 


3046 


787CIP2 96 


7915 


95 


1079 


2063 


3047 


787CIP2_97


7920 


96 


1080 


2064 


3048 


787CIP2 98 


7921 


97 


1081 


2065 


3049 


787CIP2_99 


7924 


98 


1082 


2066 


3050 


787CIP2 100 


7927 


99 


1083 


2067 


3051 


787CIP2 101 


7929 


100 


1084 


2068 


3052 


787CIP2 102 


7937 


101 


1085 


2069 


3053 


787CIP2 103 


7940 


102 


1086 


2070 


3054 


787CIP2 104 


7942 


103 


1087 


2071 


3055 


787CIP2_105


7944 


104 


1088 


2072 


3056 


787CIP2_106 


7951 


105 


1089 


2073 


3057 


787CIP2 107 


7951 


106 


1090 


2074 


3058 


787CIP2 108 


7962 


107 


1091 


2075 


3059 


787CIP2 109 


7964 


108 


1092 


2076 


3060 


787CIP2_110 


7977 


109 


1093 


2077 


3061 


787CIP2_111


7978 


110 


1094 


2078 


3062 


787CIP2_112 


7980 


111 


1095 


2079 


3063 


787CIP2 113 


7982 


112 


1096 


2080 


3064 


787CIP2 114 


8000 


113 


1097 


2081 


3065 


787CIP2 115 


8003 
114



115 



1098 



1099 



2082 



2083 



3066 



3067 



787CIP2 116 



787CIP2 117 
8004



8007 



117 



118 



1100 



2084 



3068 



1101 



2085 



787CIP2 118 



3069 



1102 



2086 



787CIP2 119 



3070 



787CIP2_120



8008 



8009 



8013 



120 
121 



1103 



2087 



3071 



1104 



2088 



787CIP2 121 



3072 



787CIP2 122 



8017 



8018 



122 
123 



1105 



1106 



2089 

2090 



3073 



787CIP2_123



3074 



787CIP2_124



8021 



8022 



1107 



2091 



3075 



787CIP2 125 



8023 



125 



126 



127 
128 



1108 



2092 



3076 



1109 



2093 



787CIP2 126 



3077 



1110 



1111



2094 



787CIP2 127 



3078 



2095 



787CIP2 128 



3079 



787CIP2 129 



8023 



8024 



8026 



8028 



129 

130 



131 



132 



133 



134 



135 



136 



137 



138 



139 



140 



141 



142 



143 



144 



145 



146



147 



148 



1112 



2096 



3080 



1113 



2097 



787CIP2 130 



3081 



1114 



2098 



787CIP2 131 



3082 



1115 



2099 



787CIP2 132 



1116 



3083 



2100 



1117 



2101 



1118 



2102 



1119 



1120 



2103 
2104 



1121 



2105 



1122 



2106 



1123



1124 



1125 



1126 



1127



1128 



1129 



1130



1131



1132 



2107 



2108 



2109 



2110 



2111 



2112 



2113 



2114



2115 



2116 



3084 



3085 



3086 



3087 



3088 



3089 



3090 



3091 



3092 



3093 



3094 



3095 



3096 



3097 



3098



3099



787CIP2_133



787CIP2 134 



787CIP2 135 



787CIP2_136 
787CIP2 137 



787CIP2 138 



787CIP2 139 



787CIP2 140 



787CIP2 141 



787CIP2 142 



787CIP2 143 



787CIP2 144 



787CIP2 145 



787CIP2 146 



787CIP2 147 



787CIP2 148 



3100 



787CIP2 149 



787CIP2 150 



8036 



8038 



8045 



8045 



8048 



8048 



8052 



8053 



8055 



8059 



8061 



8062 



8063 



8064 



8065 



8068 



8069 



8070 



8074 



8076 



8077 



150 



151 



1133 



2117 



3101 



1134 



2118 



787CIP2 151 



1135 



3102 



2119 



787CIP2 152 



3103 



787CIP2 153 



8078 



8079 



8087 



153 



1136 



2120 



3104 



1137 



2121 



787CIP2 154 



3105 



787CIP2 155 



8091 



8100 



1138 



2122 



3106 



787CIP2 156 



8105 



156 



157 



158 



159 



1139 



2123 



1140 



2124 



3107 
3108 



787CIP2 157 



1141 



2125 



787CIP2 158 



3109 



1142 



2126 



787CIP2 159 



3110 



1143 
1144 



2127 



787CIP2 160 



3111 



787CIP2 161 



8106 



8108 



8109 



8110 



8112 



2128 



3112 
3113 



787CIP2 162 



8116 



162 



163 



164 



165 



166 



167 



168 



1145 



2129 



1146 



2130 



1147 



3114 



2131 



3115 



1148 



2132 



1149 



3116 



2133 



1150 



3117 



2134 



1151 



2135 



1152 
1153 



2136 



3118 



3119 
3120 



787CIP2 163 



787CIP2_164



787CIP2_165



787CIP2 166 



787CIP2 167 



787CIP2 168 



787CIP2 169 



787CIP2 170 



8118 



8124 



8125 
8127 



8132 



8135 



8137 



8139 



170 



171 



172 
173 



2137 



3121 



1154 



2138 



787CIP2 171 



3122 



1155 



2139 



787CIP2 172 



1156 
1157 



3123 



2140 
2141 



787CIP2 173 



3124 



787CIP2_174



8140 



8140 



8140 



8141 



3125 



787CIP2 175 



8147 
174


1158 


2142 


3126 


787CIP2 176 


8149 


175 


1159 


2143 


3127 


787CIP2_177


8150


176


1160 


2144 


3128 


787CIP2 178 


8157 


177 


1161 


2145 


3129 


787CIP2 179 


8161 


178 


1162 


2146 


3130 


787CIP2 180 


8162 


179 


1163 


2147 


3131 


787CIP2 181 


8165 


180


1164 


2148 


3132 


787CIP2 182 


8166 


181 


1165 


2149 


3133 


787CIP2 183 


8167 


182 


1166 


2150 


3134 


787CIP2 184 


8169 


183 


1167 


2151 


3135 


787CIP2 185 


8170 


184 


1168 


2152 


3136 


787CIP2 186 


8172 


185 


1169 


2153 


3137 


787CIP2 187 


8173 


186 


1170 


2154 


3138 


787CIP2 188 


8174 


187 


1171 


2155 


3139 


787CIP2 189 


8174 


188 


1172 


2156 


3140 


787CIP2_191


8182 


189 


1173 


2157 


3141 


787CIP2 192 


8186 


190 


1174 


2158 


3142 


787CIP2 193 


8188 


191 


1175 


2159 


3143 


787CIP2 194 


8191 


192 


1176 


2160 


3144 


787CIP2 195 


8192 


193 


1177 


2161 


3145 


787CIP2 196 


8193 


194 


1178 


2162 


3146 


787CIP2 197 


8194 


195 


1179 


2163 


3147 


787CIP2 198 


8195 


196 


1180 


2164 


3148 


787CIP2 199 


8196 


197 


1181 


2165 


3149 


787CIP2_200 


8200 


198 


1182 


2166 


3150 


787CIP2_201


8201 


199 


1183 


2167 


3151 


787CIP2 202 


8202 


200


1184 


2168 


3152 


787CIP2 203 


8205 


201 


1185 


2169 


3153 


787CIP2 204 


8206 


202 


1186 


2170 


3154 


787CIP2 205 


8207 


203


1187 


2171 


3155 


787CIP2_206 


8208 


204 


1188 


2172 


3156 


787CIP2 207 


8209 


205


1189


2173


3157


787CIP2_208


8210


206


1190


2174


3158


787CIP2_209


8211


207


1191 


2175 


3159 


787CIP2 210 


8212 


208 


1192 


2176 


3160 


787CIP2_211 


8213 


209 


1193 


2177 


3161 


787CIP2_212


8214 


210 


1194 


2178 


3162 


787CIP2 213 


8215 


211 


1195 


2179 


3163 


787CIP2 214 


8216 


212 


1196 


2180 


3164 


787CIP2 215 


8217 


213 


1197 


2181 


3165 


787CIP2 217 


8221 


214 


1198 


2182 


3166 


787CIP2 218 


8222 


215 


1199 


2183 


3167 


787CIP2_219 


8223 


216 


1200 


2184 


3168 


787CIP2_220


8224 


217 


1201 


2185 


3169 


787CIP2 221 


8225 


218 


1202 


2186 


3170 


787CIP2 222 


8227 


219 


1203 


2187 


3171 


787CIP2 223 


8232 


220 


1204 


2188 


3172 


787CIP2 224 


8235 


221 


1205 


2189 


3173 


787CIP2_225 


8236 


222 


1206 


2190 


3174 


787CIP2 227 


8238 


223 


1207 


2191 


3175 


787CIP2_228


8239 


224 


1208 


2192 


3176 


787CIP2 229 


8240 


225 


1209 


2193 


3177 


787CIP2 230 


8242 


226 


1210 


2194 


3178 


787CIP2_231 


8246 


227 


1211 


2195 


3179 


787CIP2_232 


8252 


228 


1212 


2196 


3180 


787CIP2_233


8257 


229 


1213 


2197 


3181 


787CIP2 234 


8288 


230 


1214 


2198 


3182 


787CIP2_235 


8310 


231 


1215 


2199 


3183 


787CIP2_236


8311 


232 


1216 


2200 


3184 


787CIP2 237 


8315 


233 


1217 


2201 


3185


787CIP2 238 


8318 
234


1218


2202


3186


787CIP2_239


8326


235


1219


2203


3187


787CIP2_240


8326


236


1220


2204


3188


787CIP2_241


8336


237


1221


2205


3189


787CIP2_242


8351


238


1222


2206


3190


787CIP2_243


8364


239


1223


2207


3191


787CIP2_244


8372


240


1224


2208


3192


787CIP2_245


8376


241


1225


2209


3193


787CIP2_246


8377


242


1226


2210


3194


787CIP2_247


8382


243


1227


2211


3195


787CIP2_248


8404


244


1228


2212


3196


787CIP2_249


8410


245


1229


2213


3197


787CIP2_250


8419


246


1230


2214


3198


787CIP2_251


8430 


247 
773


1754

1755

1756

1757


2738

2739

2740

2741


3722
3723
3724
3725


787CIP2C_3
787CIP2C_4
787CIP2C_5
787CIP2C_6
787CIP2C_7


1916 
2072 
2424 
W74 
!474 
774


1758 


2742 


3726 


787CIP2C 8 


2887 


775 


1759 


2743 


3727 


787CIP2C 9 


3001 


776


1760 


2744 


3728 


787CIP2C_10 


3182 


777


1761 


2745 


3729 


787CIP2C_11


3182 


778


1762 


2746 


3730 


787CIP2C 12 


3182 


779


1763 


2747 


3731 


787CIP2C 13 


3193 


780 


1764 


2748 


3732 


787CIP2C 14 


3196 


781 


1765 


2749 


3733 


787CIP2C_15 


3224 


782 


1766 


2750 


3734 


787CIP2C 16 


3225 


783 


1767 


2751 


3735 


787CIP2C 17 


3234 


784 


1768 


2752 


3736 


787CIP2C 18 


3241 


785 


1769 


2753 


3737 


787CIP2C 19 


3243 


786 


1770 


2754 


3738 


787CIP2C 20 


3243 


787 


1771 


2755 


3739 


787CIP2C 21 


3259 


788 


1772 


2756 


3740 


787CIP2C 22 


3272 


789 


1773 


2757 


3741 


787CIP2C 23 


3278 


790 


1774 


2758 


3742 


787CIP2C_24 


3296 


791 


1775 


2759 


3743 


787CIP2C 25 


3327 


792 


1776 


2760 


3744 


787CIP2C 26 


3334 


793 


1777 


2761 


3745 


787CIP2C 27 


3339 


794 


1778 


2762 


3746 


787CIP2C_28 


3347 


795 


1779 


2763 


3747 


787CIP2C_29 


3387 


796 


1780 


2764 


3748 


787CIP2C 30 


3392 


797 


1781 


2765 


3749 


787CIP2C 31 


3411 


798 


1782 


2766 


3750 


787CIP2C 32 


3427 


799 


1783 


2767 


3751 


787CIP2C 33 


3432 


800 


1784 


2768 


3752 


787CIP2C_34 


3441 


801 


1785 


2769 


3753 


787CIP2C 35 


3479 


802 


1786 


2770 


3754 


787CIP2C 36 


3488 


803 


1787 


2771


3755 


787CIP2C 37 


3488 


804 


1788 


2772 


3756 


787CIP2C_38 


3553 


805 


1789 


2773


3757


787CIP2C_39


3560


806


1790


2774


3758


787CIP2C_40


3618


807


1791


2775


3759 


787CIP2C 41 


3642 


808 


1792 


2776 


3760 


787CIP2C_42


3649 


809 


1793 


2777 


3761 


787CIP2C 43 


3676 


810 


1794 


2778 


3762 


787CIP2C 44 


3747 


811 


1795 


2779 


3763 


787CIP2C 45 


3917 


812 


1796 


2780 


3764 


787CIP2C_46


4218 


813 


1797 


2781 


3765 


787CIP2C 47 


4219 


814 


1798 


2782 


3766 


787CIP2C 48 


4222 


815 


1799 


2783 


3767 


787CIP2C_49 


4222 


816 


1800 


2784 


3768 


787CIP2C 50 


4229 


817 


1801 


2785 


3769 


787CIP2C 51 


4230 


818 


1802 


2786 


3770 


787CIP2C 52 


4240 


819 


1803 


2787 


3771 


787CIP2C 53 


4241 


820 


1804 


2788 


3772 


787CIP2C 54 


4249 


821 


1805 


2789 


3773 


787CIP2C 55 


4252 


822 


1806 


2790 


3774 


787CIP2C_56 


4267 


823 


1807 


2791 


3775 


787CIP2C_57 


4272 


824 


1808 


2792 


3776 


787CIP2C_58 


4273 


825 


1809 


2793 


3777 


787CIP2C 59 


4275 


826 


1810 


2794 


3778 


787CIP2C 60 


4283 


827 


1811


2795 


3779 


787CIP2C_61 


4290 


828 


1812 


2796 


3780 


787CIP2C 62 


4292 


829 


1813 


2797 


3781 


787CIP2C 63 


4305 


830 


1814 


2798 


3782 


787CIP2C 64 


4306 


831 


1815 


2799 


3783 


787CIP2C_65 


4308 


832 


1816 


2800 


3784 


787CIP2C 66 


4322 


833 


1817 


2801 


3785 


787CIP2C 67 


4351 
834


1818


2802


3786


787CIP2C_68


4356


835


1819


2803


3787


787CIP2C_69


4399


836


1820


2804


3788


787CIP2C_70


4400


837


1821


2805


3789


787CIP2C_71


4520


838


1822


2806


3790


787CIP2C_72


4598


839


1823


2807


3791


787CIP2C_73


4599


840


1824


2808


3792


787CIP2C_74


4600


841


1825


2809


3793


787CIP2C_75


4670


842


1826


2810


3794


787CIP2C_76


4708


843


1827


2811


3795


787CIP2C_77


4734


844


1828


2812


3796


787CIP2C_78


4738


845


1829


2813


3797


787CIP2C_79


4749


846


1830


2814


3798


787CIP2C_80


4752


847


1831


2815


3799


787CIP2C_81


4752


848


1832


2816


3800


787CIP2C_82


4770


849


1833


2817


3801


787CIP2C_83


4784


850


1834


2818


3802


787CIP2C_84


4785


851


1835


2819


3803


787CIP2C_85


4792


852


1836


2820


3804


787CIP2C_86


4803


853


1837


2821


3805


787CIP2C_87


4811


854


1838


2822


3806


787CIP2C_88


4817


855


1839


2823


3807


787CIP2C_89


4818


856


1840


2824


3808


787CIP2C_90


4820


857


1841


2825


3809


787CIP2C_91


4831


858


1842


2826


3810


787CIP2C_92


4841


859


1843


2827


3811


787CIP2C_93


4869


860


1844


2828


3812


787CIP2C_94


4876


861


1845


2829


3813


787CIP2C_95


4902


862


1846


2830


3814


787CIP2C_96


4910


863


1847


2831


3815


787CIP2C_97


4931


864


1848


2832


3816


787CIP2C_98


5303


865


1849


2833


3817


787CIP2C_99


5317


866


1850


2834


3818


787CIP2C_100


5322


867


1851


2835


3819


787CIP2C_101


5330


868


1852


2836


3820


787CIP2C_102


5333


869


1853


2837


3821


787CIP2C_103


5333


870


1854


2838


3822


787CIP2C_104


5356


871


1855


2839


3823


787CIP2C_105


5363


872


1856


2840


3824


787CIP2C_106


5364


873


1857


2841


3825


787CIP2C_107


5379


874


1858


2842


3826


787CIP2C_108


5386


875


1859


2843


3827


787CIP2C_109


5397


876


1860


2844


3828


787CIP2C_110


5401


877


1861


2845


3829


787CIP2C_111


5419


878


1862


2846


3830


787CIP2C_112


5420


879


1863


2847


3831


787CIP2C_113


5452


880 


1864 


2848 


3832 


787CIP2C_114


5467 


881 


1865 


2849 


3833 


787CIP2C 115 


5482


882


1866 


2850 


3834 


787CIP2C_116


5483 


883 


1867 


2851 


3835 


787CIP2C 117 


5492 


884 


1868 


2852 


3836 


787CIP2C 118 


5499 


885 
886 


1869 
1870 


2853 
2854 


3837 
3838 


787CIP2C 119 


5525 


887 

888 


1871 
1872 


2855 
2856 


3839 
3840 


787CIP2C 120 
787CIP2C 121 


5538 
5539 


889 
890 


1873 

1874


2857 
2858 


3841 
3842 


787CIP2C 122 
787CIP2C 123 


5558 

5559 


891 
892 
893 


1875

1876

1877


2859

2860

2861


3843
3844
3845


787CIP2C_124
787CIP2C_125
787CIP2C_126
787CIP2C_127


5586 
5619 
>628 
>640 
894


1878 


2862 


3846 


787CIP2C 128 


5640 


895 


1879 


2863 


3847 


787CIP2C_129 


5827 


896 


1880 


2864 


3848 


787CIP2C 130 


6094 


897 


1881 


2865 


3849 


787CIP2C_131


6195 


898 


1882 


2866 


3850 


787CIP2C 132 


6206 


899 


1883 


2867 


3851 


787CIP2C_133


6355 


900 


1884 


2868 


3852 


787CIP2C_134 


6362 


901 


1885 


2869 


3853 


787CIP2C_135 


6386 


902 


1886 


2870 


3854 


787CIP2C_136 


6431 


903 


1887 


2871 


3855 


787CIP2C_137


6457 


904 


1888 


2872 


3856 


787CIP2C_138


6480 


905 


1889 


2873 


3857 


787CIP2C_139 


6497 


906 


1890 


2874 


3858 


787CIP2C 140 


6532 


907 


1891 


2875 


3859 


787CIP2C 141 


6592


908 


1892 


2876 


3860 


787CIP2C 142 


6644 


909 


1893 


2877 


3861 


787CIP2C 143 


6644 


910 


1894 


2878 


3862 


787CIP2C_144 


6645 


911 


1895 


2879 


3863 


787CIP2C_145 


6645 


912 


1896 


2880 


3864 


787CIP2C_146 


6761 


913 


1897 


2881 


3865 


787CIP2C 147 


6782 


914 


1898 


2882 


3866 


787CIP2C_148 


6981 


915 


1899 


2883 


3867 


787CIP2C_149 


6981 


916 


1900 


2884 


3868 


787CIP2C_150 


7000 


917 


1901 


2885 


3869 


787CIP2C_151 


7029 


918 


1902 


2886 


3870 


787CIP2C 152 


7885 


919 


1903 


2887 


3871 


787CIP2C 153 


8143 


920 


1904 


2888 


3872 


787CIP2C_154 


8143 


921 


1905 


2889 


3873 


787CIP2C 155 


8234 


922 


1906 


2890 


3874 


787CIP2C 156 


8463 


923


1907 


2891


3875 


787CIP2C_157 


8467


924 


1908 


2892 


3876 


787CIP2C 158 


8540 


925


1909 


2893 


3877 


787CIP2C 159 


8600


926


1910


2894


3878


787CIP2C_160


9656


927


1911 


2895 


3879 


787CIP2C 161 


9669


928 


1912 


2896 


3880 


787CIP2C_162 


9695 


929 


1913 


2897 


3881 


787CIP2C_163 


9744 


930 


1914


2898 


3882 


787CIP2C_164


9849 


931 


1915 


2899 


3883 


787CIP2D_1 


4180 


932 


1916 


2900 


3884 


787CIP2D 2 


4181 


933 


1917 


2901 


3885 


787CIP2D_3 


4314 


934 


1918 


2902 


3886 


787CIP2D_4


4500 


935 


1919 


2903 


3887 


787CIP2D 5 


5651 


936 


1920 


2904 


3888 


787CIP2D_6 


5691 


937 


1921 


2905 


3889 


787CIP2D 7 


5881 


938 


1922 


2906 


3890 


787CIP2D 8 


5882 


939 


1923 


2907 


3891 


787CIP2D 9 


6209 


940 


1924 


2908 


3892 


787CIP2D 10 


6719 


941 


1925 


2909 


3893 


787CIP2D_11


8130 


942 


1926 


2910 


3894 


787CIP2D_12 


8863 


943 


1927 


2911 


3895 


787CIP2D 13 


8902 


944 


1928 


2912 


3896 


787CIP2D 14 


9162 


945 


1929 


2913 


3897 


787CIP2D_15 


9197 


946 


1930 


2914 


3898 


787CIP2D_16 


9215 


947 


1931 


2915 


3899 


787CIP2D 17 


9232 


948 


1932 


2916 


3900 


787CIP2D 18 


9262 


949 


1933 


2917 


3901 


787CIP2D 19 


9369 


950 


1934 


2918 


3902 


787CIP2D_20 


9371 


951 


1935 


2919 


3903 


787CIP2D 21 


9516 


952 


1936 


2920 


3904 


787CIP2D 22 


9601 


953 


1937 


2921 


3905 


787CIP2D_23 


9731 
954


1938


2922


3906


787CIP2D_24


9733


955


1939


2923


3907


787CIP2D_25


9769


956


1940


2924


3908


787CIP2D_26


9804


957


1941


2925


3909


787CIP2D_27


9816


958


1942


2926


3910


787CIP2D_28


9844


959


1943


2927


3911


787CIP2D_29


9924


960


1944


2928


3912


787CIP2D_30


9936


961


1945


2929


3913


787CIP2D_31


10163


962


1946


2930


3914


787CIP2D_32


10165


963


1947


2931


3915


787CIP2D_33


10165


964


1948


2932


3916


787CIP2D_34


10244


965


1949


2933


3917


787CIP2D_35


10278


966


1950


2934


3918


787CIP2E_1


4251


967


1951


2935


3919


787CIP2E_2


5310


968


1952


2936


3920


787CIP2E_3


5697


969


1953


2937


3921


787CIP2E_4


5731


970


1954


2938


3922


787CIP2E_5


5733


971


1955


2939


3923


787CIP2E_6


5734


972


1956


2940


3924


787CIP2E_7


5740


973


1957


2941


3925


787CIP2E_8


7657


974


1958


2942


3926


787CIP2E_9


9572


975


1959


2943


3927


787CIP2F_1


1363


976


1960


2944


3928


787CIP2F_2


4303


977


1961


2945


3929


787CIP2F_3


5760


978


1962


2946


3930


787CIP2F_4


5766


979


1963


2947


3931


787CIP2F_5


5767


980 


1964 


2948 


3932 


787CIP2F 6 


5767 


981 


1965 


2949 


3933 


787CIP2F 7 


5770 


982 


1966 


2950 


3934 


787CIP2F 8 


6855 


983 
984 


1967 
1968 


2951 
2952 


3935 
3936 


787CIP2F_9
787CIP2F_10


10026
10227





TABLE 6 



SEQ ID
NO:


Method


Predicted beginning
nucleotide location
corresponding to first
amino acid residue of
peptide sequence


Predicted end
nucleotide location
corresponding to last
amino acid residue of
peptide sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


2953 


A 


3 


324 


ISEHRJEASGNYJLAQRLTSSFLRGLSSWKSNPLML 

CGWmLTLTMVQGEP*GP\KGIPG\FHTNSSYPH 

WGTVAKPPAGD*DLLPAPGQEGTPLFTR*SLCTY 
CPID


2954


A 


18 


467 


REELGKDLFDCTLYVLLKYDDFNADKHLALEEF 

YRAFQVIQLSLPEDQKLSITAATVGQSAVLSCAIQ 

GTLRPPnWKRNNIILNNLDLEDINDFGDDGSLYIT 

KVTTTHVGNYTCYADGYEQWQTHIFQVNVPPV 

IRVYPESQARRAG


2955


A 


3 


23 


FYSAFLVADKGIVTSKHNNDTQfflWESDSNEFSV 
IADPRGNTLGRGTnT*VSIPPSL


2956


A 


1 


493 


RTKTDVYILMLAVADLLLLFTLPFWAVNAVHGW 

VLGKIMCKITSALYTLNFVSGMQFLACISIDRYV 

AVTKVPSQSGVGKPCWnCFCVWMAAILLSIPQL 

VFTTVNDNARCIPIFPRYLGT^MKALIQMLEICIG 

FVVPFLIMGVCYFITARTLMKMPNIKIS 


2957 


A 


703 


302 


EETGVREKRRERMKEKMWQNVLCCTLQTAVIL 
KLFQNKVLNILKNFFLSPLDTRKNKVFKKWAGG 
PGAVAHACNPSTLGGRGGRITKSGDRDHPGQHG 
ETRSLPACWAQWKSLALPVSRAPGRQGSLWFP 
LP 


2958 


A 


575 


1054 


CTKCKADCDTCFNKNFCTKCKSGFYLHLGKCLD 
NCPEGLEANNHIMECVSIVHCEVSEWNPWSPCT 
KKGKTCGFBCRGTETRVREnQHPSAKGNLCPPTN 
ETRKCTVQRKKCQKGERGKKGRERKRKKPNKG 
ESKEAIPDSKSLESSKEIPEQRENKQQQ 


2959 


A 


1 


426 


LSMLSTISTEHRLSVLWPIWYCCHCPIHLSAVMC 
VLLWALSLLQSILEWMFCSFLFSDVDSDNWCQIL 
DFLTAVWLIFLI\L\nXGFTLVLLVK[ICGSQKMPL 
TRLYVTILLTGLVFLFCSLPLSIQ*FLLYWIEKDLD 
DL 


2960 


A 


1194 


852 


EKRKTSYSQCLNSKQRNVSMRPSIWIHVHLKPPC 
RLVELLPFSSALQGLSHLSLGTTLPA^*GHLRFRL 
RNLPQSLRTVILPERNEEQNLQELSHNADKYQM 
GDCCKEEIDDSIFy 


2961 


A 


274 


2250 


EKGKVKDAGAEQWISLSLSCKGSWETQFSNHLN 

SLTPPTSVRRMPLITTVTLLKMVARHHMKLLCSK 

AFSTQLQQKIFLHSQMGIHHQSVCMKLKPNTSHII 

SILMGQPMALVQLETLAPLXniQKFQTQDHMKF 

WKNLPLHSHHLTPSVPQTVIPKKTGSPEIKLKITK 

TIQNGRELFESSLCGDLLNEVQASE\Q*NQSIESRK 

EKRKKSNKHDSSRSEERKSHKIPKLEPEEQNRPN 

ERVDWSEKPREEPVLKEGSPSSANTIFCSNNGSV 

HW\FKFQVGDLVWSKVGTYPWWPCMVSSDPQL 

EVHTKINJRGAREyHVQFFSNQPERAWVHEKRV 

WEi^KGIKQYEELI^ATKQASNHSEKQkl^ 

PQREIO^QWDIGiAHAEK^ 

DKQPEEALSQAKKSVASKTEVKKTRRPRSVLNT 

QPEQTNAGEVASSLSSTEIRRHSQRRHTSAEEEEP 

PPVKIAWKTAAARKSLPASITMHKGSLDLQKCN 

MSPVVKIEQVFALQNATGDGKFIDQFVYSTKGIG 

NKTEISVRGQDRLnSTPNQRJNEKPTQSVSSPEATS 

GSTGSVEKKQQRRSIRTRSESEKSTEVVPKKBaK 

KEQVETVPQATVKTGLQKGSADRGVQGSVRFSD 

SSVSAAIEETVD 










2962 


A 


2408 


836 


SASPPPPPPPPPSRFPFSGAPGARDRSGPLGSEPQR 

NPGARPRTLEATVTPPGSVGAMSSSGLNSEKVA 

ALIQKLNSDPQFVIJVQNVGrrHDLLDICLKRATV 

QRAQHVFQHAVPQEGKPITNQKSSGRCWIFSCLN 

\nVlRLPFMKKLNIEEFEFSQSYLFFWDKVERCYFF 

LSAFVDTAQRKEPEDGRLVQFLLMNPANDGGQ 

WDMLVNIVEKYGVIPKKCFPESYTTEATRRMND 

ILNHKMREFCIRLRNLVHSGATKGEISATQDVM 

MEEIFRWCICLGNPPETFTWEYRDKDKNNKKIG 

PXimEFhnR/EQHVKPLFNMEDKICLVNDPRPQH 

KYNKLYTV\EYL\SNMVWRGEKLFYNNQPIDFLK 

KMVAAS1KDG\EAVWFGCDVGKHF\NSBXG\LSD 

MNLYDHELWGVSLK^^VINKAER\LTFGES\LMT 

HTMTFTAV/SQSRDDSGMVLFTKWXRVGEFQWG 

EDHGH\KGYLCMTD*VGSLEYVYEWAnVDRKH 

VP\EEVLAVLGAGNPFVLPAWDPMGALAE 


2963 


A 


90 


543 


RHYDSAGKITLKIAKNYLEQRAVGGASPRLAQS 

VLTCSREPILENSLTSLIEYLHNALEHDMRLRFNN 

DRMKTTIKETST*LSNSYLVFPLM*SLTYLMKMS 

FERCTAM^^KMFVNSPFTKVDNYCT^SSVWKiO? 
KCYFSLNTIKKEKKMT 



2454 



2965 



2454 



2966 



1693 



227 



FDTYRGLPSISNGNYSQLQFQAREYSGAPYSQRIS 

AITTVSVAWKVLSGKIGEGAEGNCKCVISEGAW 

AVCPTQPCGKAKPDKHLKDLLSKLLNSGYFESIP 

VPKNAKEKEVPLEEEMLIQSEKKTQLSKTESVKE

SESLMEFAQPEIQPQEFLNRRYMTEVDYSNKQGE 

EQPWEADYARKPNLPKRWDMLTEPDGQEKKQE 

SFKSWEASGKHQEVSKPAVSLEQRKQDTSKLRS 

TLPEEQKKQEISKSKPSPSQWKQDTPKSKAGYVQ 

EEHKKQETPKLWPVQLQKEQDPKKQTPKSWTPS 

MQSEQNTTKSWTTPMCEEQDSKQPETPKSWENN 

VESQKHSLTSQSQISPKSWGVATASLIPNDQLLPR 

KLNTEPKDVP/IACASA*GFLPLQPPFRRI/HVLRK 

EKLQDLMTQIQGTCNFMQESVLDFDKPSSAIPTS 

QPPSATPG*PRRHLKEQNLS\VKVIFFQGAVIAVF

NVNAPLPPRKEQEIKESPYSPGYNQSFTTASTQTP 

PQCQLPSIHVEQTVHSQETANYHPDGTIQVSNGS 

LAFYPAQTNVFPRPTQPFVNSRGSVRGCTRGGRL

ITNSYRSPGGYKGFDTYRGLPSISNGNYSQLQFQ

AREYSGAPYSQRDNFQQCYKRGGTSGGPRANSR 

AGWSDSSQVSSPERDNETFNSGDSGQGDSRSMT 

PVDVPVTNPAATILPVHVYPLPQQMRVAFSAAR 

TSNLAPGTLDQPIVFDLLLNNLGETFDLQLGRFN 

CPVNGTYVFJDFHMLKLAXnWPLYVNLMKNEEVL 

VSAYANDGAPDHETASNHAILQLFQGDQIWLRL 
HRGAIYGSSW 



FDTYRGLPSISNGNYSQLQFQAREYSGAPYSQRIS

AITTVSVAWKVLSGKIGEGAEGNCKCVISEGAW 

AVCPTQPCGKAKPDKHLKDLLSKLLNSGYFESIP 

VPKNAKEKEVPLEEEMLIQSEKKTQLSKTESVKE 

SESLMEFAQPEIQPQEFLNRRYMTEVDYSNKQGE 

EQPWEADYARKPNLPKRWDMLTEPDGQEKKQE 

SFKSWEASGKHQEVSKPAVSLEQRKQDTSKLRS 

TLPEEQKKQEISKSKPSPSQWKQDTPKSKAGYVQ 

EEHKKQETPKLWPVQLQKEQDPKKQTPKSWTPS 

MQSEQNTTKSWTTPMCEEQDSKQPETPKSWENN 

VESQKHSLTSQSQISPKSWGVATASLIPNDQLLPR 

KLNTEPKDVP/IACASA*GFLPLQPPFRRI/HVLRK

EKLQDLMTQIQGTCNFMQESVLDFDKPSSAIPTS 

QPPSATPG*PRRHLKEQNLS\VKVIFFQGAVIAVF 

NVNAPLPPRKEQEIKESPYSPGYNQSFTTASTQTP 

PQCQLPSIHVEQTVHSQETANYHPDGTIQVSNGS 

LAFYPAQTNVFPRPTQPFVNSRGSVRGCTRGGRL 

ITNSYRSPGGYKGFDTYRGLPSISNGNYSQLQFQ 

AREYSGAPYSQRDNFQQCYKRGGTSGGPRANSR 

AGWSDSSQVSSPERDNETFNSGDSGQGDSRSMT 

PVDVPVTNPAATILPVHVYPLPQQMRVAFSAAR 

TSNLAPGTLDQPIVFDLLLNNLGETFDLQLGRFN 

CPVNGTTVFnmfLKLAVNVPLYVM-MKNEEVL 

VSAYANDGAPDHETASNHAILQLFQGDQIWLRL

HRGAIYGSSW 



DYVLTAJELHRQRSPGVSFGLSVFNLMNAIMGSGI 

LGLAYVMANTGVFGFSFLLLTVALLASYSVHLL 

LSMCIQTAYLGP*TNYFMVLPAH*LTCLPLffiFLQ 










SL*NSL\*AVTSYEDLGLFAFGLPGKLWAG'niIQ 

NIGAMSSYLLIIKTELPAAIAEFLTGDYSRYWYLD 

GQTLLinCVGIVFPLALLPKIGFLGYTSSLSFFFM 

IvlFFALVVIIKKWSIPCPLTLNYVEKGFQISNV'roD 

CKPKLFHFSKESAYALPTMAFSFLCHTSILPIYCE 

LQSPSKKRMQNVThnrAIALSFLIYFISALFGYLTF 

YD/GTTKAQRGEWCHRIKDKVESELLKG***IP* ' 

SHD\rsrvnVlT^VKLCILFA\a.L\TWLIHFPAR^ 

NIMFFSNFPFSWIRHFLITLALMIIVLLAIYVPD 

VFGWGASTSTCLIFIFPGLFYLKLSREDFLSWKK 

LGVGCFC/LLSFKTSBLRNSLSVYIILPASRKSIYFK 

I 


2967 


A 


3 


3222 


SGIWRALWREKXPGGGRRVKRRNPGRQAVGH 

TEEDPPRVGTPWKEHTGPGPQEGSTMEAAHAKT 

TEECLAYFGVSETTGLTPDQVKRNLEKYGLNELP 

AEEGKTLWELVIEQFEDLLVRILLLAACISFVLA 

WFEEGEEHTAFVEPFVILLILIANAIVGVWQERN 

AENAIEALKEYEPEMGKVYRADRKSVQRIKARD 

rVTGDIVEVAVGDKVPADIRILAIKSTTLRVDQSIL 

TGEYVSVIKHTEPWDPRAVNQDKKNMLFSGTM 

AAGKALGIVATTGVGTEIGKIRDQMAATEQDKT 

PLQQKLDEFGEQLSKVISLICVAVWLINIGHFNDP 

VHGGSWFRGAIYYFKIAVALAVAAffEGLPAVIT 

TCLALGTIIRMAKKNAIVRSLPSVETLGCTSVICS 

DKTGTLTTNQMSVCKMFIIDKVDGDICLLNEFSIT 

GSTYAPEGEVLKNDKPVRPGQYDGLVELATICA 

LCNDSSLDFNEAKG VYEKVGEAf ETALTIliy ! 

MNVFNTDVRSLSKVnERANACNSVlRiQL]^ 

LEFSRDRKSMSVYCSPAKSSRAAVGNKMFVKGA 

PEGVIDRCNYVRVGTTRVPLTGPVKEKIMAVIKE

WGTGRDTLRCLALATRDTPPKREEMVLDDSARF 

LEYETDLTFVGVVGMLDPPRKEVTGSIQLCRDA 

GIRVIMITGDNKGTAIAICRRJGIFGENEEVADRA 

Y\TGREFDDL\PLAEQVREACRRACCFARVEPSHK 

SKIVEYLQSYDEITAMTGDGVNDAPALKKAEIGI 

AMGSGTAVAKTASEMVLADDNFSTIVAAVEEGR 

AIYNNMKQFIRYLISSNVGEWCIFLTAALGLPEA 

LIPVQLLWVNLVTDGLPATALGFNPPDLDIMDRP 

PRSPKEPLIVSGWLFFRYMAIGGYVGAATVGAAA 

WWFLYAEDGPHV>rYSQLTHFMQCTEDNTHFEGI 

DCEVFEAPEPMTMALSVLVTIEMCNALNSLSEN 

QSLLRMPPWVNIWLLGS JCLSMSLHFLIL YVDPLP 

IvnFKLRALDLTQWLMVLkJSLPVIGLDEILKFVA 

RNYLEG*LFPLLHL*ARVTDPEDERRK 


2968 


A 


3 


2414 


GARSCSRLGRCTFPLWKGREMEVRKLSISWQFLI 

VLVLBLQDLSALDFDPYRVLGVSRTASQADIKKA 

YKKLAREWHPDKNKDPGAEDKFIQISKAYEILSN 

EEKRSNYDQYGDAGENQGYQKQQQQREYRFRH 

FHENFYFDESFFHFPFNSERRDSrDEKYLLHFSHY 

VNEVAPDSFKKPYLIKITSDWCFSCIHIEPVWKEV 

IQELEELGVGIGWHAGYERRLAHHLGAHSTPSI 

LGIINGKISFFHNAVVRENLRQFVESLLPGNLVEK 

VTNKimaiFLSGWQQENKPHVLLFDQTPIVPLL 

YKLTAFAYKDYLSFGYVYVGLRGTEEMTRRYNI 

NIYAFILLVFKEfflNRPADVIQARGMKKQIIDDFI 
PCT/US01/04098

2969

48

1117



TONKYLLAARLTSQKLFHELCPVKRSHRQRKYC 
VVLLTAETTKLSKPFEAFLSFALAIsrrQDTVRFVH 
VYSNRQQEFADTLLPDSEAFQGKSAVSILERRNT 
AGRWYKTLEDPWIGSESDKFILLGYLDQLRKDP 
ALLSSEAVLPDLTDELAPVFLLRWFYSASDYISD 
CWDSIFHNhTW^MMPLLSLIFSALFILFGTVIVO 
AFSDSNDERESSPPEKEEAQEKTGKTEPSFTKENS 
SKIPKKGFVEVTELTDVTYTSNLVRLRPGHMNV 
VLILSNSTKTSLLQKFALEVYTFTGSSCLHFSFLSL 
DKHREWLEYLLEFAQDAAPIPNQYDKHFMERDY 
TGYVLALNGHKKYFCLFKPQKTVEEGGKP*GSC 
SDVDSSLYLGESRGKPSCGLGSRPIKGKLSKLSL 
WMERLLEGSLQRFYIPSWPELD 



KGLSPDQVLSAFAPLDCEMWLKVFTTFLSFATG 
ACSGLKVTVPSHTVHGVRGQALYLPVHYGFHTP 
ASDIQITWLFERPHTMPKYLLGSVNKSVVPDA^GI 
P/YTSSP*CHPMASLLINPLQFPDEGNYIVKVNIQG 
NGTLSASQKIQVTVDDPVTKPWQIHPPSGAVEY 
VGNMILTCHVEGGTRLAYQWLKNGRPVHTSST 
YSFSPQNNTLHIAPVTKEDIGNYSCLVRNPVSEM 
ESDnMPIIYYGPYGLQVNSDKGLKVGEVFTVDL 
GEAILFDCSADSHPPNTYSWIRRTDNTTYIIKHGP 
RLEVASEKVAQKTMDYVCCAYNNITGRQDETHF 
TVnTSVGMCDIQGRDPNKT 



68 



936 



2971 



912 



2287 



HSALL iHSSJbCVFTLCQDFFrYSSMSEEVTYADL " 

QFQNSSEMEKIPEIGKFGEKAPPAPSHVWRPAAL 

FLTLLCLLLLIGLGVLASMFHVTLKIEMK^^ 

QNISEELQRMSLQLMSNMNISNKIRNLSTTLQTI 

ATKLCRELYSKEQEHKCKPCPRRWIWHKDSCYF 

LSDDVQTWQESKMACAAQNASLLKINNKNALE 

FIKSQSRSYDYWLGLSPEEDSA^SWYESG*YNQ\P 

SAWVIRNAPDLNNMYCGYINRLYVQYYHCTYK 
QRMICEKMANPVQLGSTYFREA 



VPNYLPSVSSAIGGEVPQRYVWRFCIGLHSAPRF 

LVAFAYWNHYLSCTSPCSCYRPLCRLNFGLNVV 

ENLALLVLTYVSSSEDF/TWVPG*GRSGEVFPEGT 

GLPLPHSDLPTSWCGHSLQCGSQSSFPPAIHENAF 

IVFIASSLGHMLLTCILWRLTKKHTVSQE\DGLSL 

AGAPRQPRRKSRTSVLRIRVMVRWELSSNGNPG 

RGVLGLGLGLGNKLRWGQNLGL*HCVWVVWE 

TGE*KRWRLQMGIE*GVASRRQ*VRNSVRGLVC 

HNSSAPPMYMGFFSPTVFGGGVGG*LHVTFILHP 

PEVEAAGIPLLLGPSLPQRQGREfflVVILAAPACA 

PFHDR*WEPREIRPSP*ELGLRGEPTLSYPASCRVI 

RQPIP*DRKSYSWKQRLFIINFISFFSALAVYFRHN 

MYCEAGVYTIFAILEYTVVLTNMAFHMTAW^ 

FGNKELLITSQPEEKRF 



1734 



246 



GGILSUKDGRTALPRPREPAERTAGLRRDMRPQE
LDARQLPAWFDQAKFGIFIHWGVFSVPSFGSEWF

WWYWQKEKIPKYVEFMKDNYPPSFKYEDFGPL 

FTAKFFNANQ\WADIFQASGAKYIVLTSKHHEGF 

TLWG\SEYSWNWNAIDEGPKRDIVKELEVAIRNR 

IDLRFGLYYSLFEWFHPLFLEDESSSFHKRQFPVS 

KTLPELYELVNNYQPEVLWSDGDGGAPDQYWN 










STOFLAWLYNESPVRGTVVTNDRWGAGSICKHG 

GFYTCSDRYNPGHLLPHKWENCMTIDKLSWGY 

RJIEAGISDYLTIEELVKQLVETVSCGGNLLMNIG 

PTLDGTISWFEERLRQMGSWLKVNGEAIYETHT 

WRSQNDTVTPDVWYTSKPKEKLVYAIFLKWPTS 

GQLFLGHPKAILGATEVKLLGHGQPLNWISLEQN 

GIMVELPQLTIHQMPCKWGWALALTNVI 


2973


A 


24 


1133 


SVPRAGGDMETGAAELYDQALLGILQHVGNVQ 

DFLRVLFGFLYRKTDFYRLLRHPSDRMGFPPGAA 

QALVLQVFKTFDHMARQDDEKRRQELEEKIRRK 

EEEEAKTVSAAAAEKEPVPVPVQEIEIDSTTELDG 

HQEVEKVQPPGPVKEMAHGSQEAEAPGAVAGA 

AEVPR\EPPILPRIQEQFQKNPDSYNGAVRENYTW 

SQDYTDLEVRVPVPKHWKGKQVSVALSSSSIRV 

AMLEENGERVLMEGKLTHKINTESSLWSLEPGK 

C\a.VNLSKVGEYWWNAILEGEEPIDIDKINKERS 

MATVDEEEQAVLDRLTFDYHQKLQGKPQSHEL 

KVHEMLKKGWDAEGSPFRGQRFDPAMFNISPGA 

VQF 


2974 


A 


271 



1854 


MQFGRAHGDCVSGAQLCGCPSMDDYMVLRMIG 

EGSFGRALLVQHESSNQMFAMKEIRLPKSFSNTQ 

NSRKEAVLLAKMKHPNIVAFKESFEAEGHLYIV 

MEYCDGGDLMQKIKQQKGKLFPEDMCLNWFTQ 

MCLGYNHIHKKRVLHRDIKSKNIFLTQNGKGKL 

GDFGSARLLSNPMAFACTYVGTPYYVPPEIWEN 

LPYNNKSDIWSLGCILYELCTLKHPFQANSWKNL 

ILKYCQGCISPLPSHYSYELQFLVKQMFKRNPSH 

RPSATTLLSRGIVARLVQKCLPPEIIMEYdEEVLE 

EIKNSKHNTTPRKKTNPSRIRIALGNEASTVQ 

DRKGSHTDLESINENLVESALRRVNREEKGNKSV 

HLRKASSPNLHRRQWEKNVPNTALTALENASILT 

SSLTAEDDRGGSVKYSKNTTRKQWLKETPDTLL 

NILKNADLSLAFQTYTIYRPGS\EGFLKGPLSEETE 

ASDSVDGGHDSVILDPERLEPGLDEEDTDFEEED 

DNPDWVSELKKRAGWQGLCDR 


2975 


A 


32 


2833 


PPGEPGAGRGALSPCGPLSGPPPLPGREAGGTCG 

QPVNPVFDLSREINPQEDFELIQRIGSGTYGDVYK 

ARNWfTGELAAIKVIKLEPGEDFAVVQQEIIMMK 

D\CKHP\DIVAYF\GSYL\RRDKLWI\CMEF\CGSGS 

\LQDIYHVTGPLSELQIAYVSRETLQGLYYLHSKG 

KMHRDIKGANILLTDNGHVKLADFGVSAQITATI 

AKRKSFIGTPYWMAPEVAAVERKGGYNQLCDL 

WAVGITAIELAELQPPMFDLHPMRALFLMTKSNF

QPPKLKDKMKWSNSFHHFVKMALTKb«>KKR^ 

AEKLLQHPFVTQHLTRSLAIELLDKVNNPDHSTY 

HDFDDDDPEPLVAVPHRIHSTSRNVREEKTRSEIT

FGQVKFDPPLRKETEPHHELPDSDGFLDSSEEIYY 

TARSNLDLQLEYGQGHQG\GYFLGANKSLLKSV 

EEELHQRGHVAHLEDDEGDDDESKHSTLKAKIP 

PPLPPKPKSEFIPQEMHSTEDENQGTIKRCPMSGSP 

\AKPSQVPPRPPPPRLPPHKPVALGNGMSSFQLNG 

ERDGSLCQQQNEHRGENLSRKEKXDVPKPISNG 

LPPTPKVHMGACFSKVFNGCPLKIHCASSWINPD 

TRDQYLIFGAEEGIYTLNLNELHETSMEQLFPRR 

CTWLYVMNNCLLSISGKASQLYSHNLPGLFDYA 
2976

32



2833 



2977 



174 



1543 



RQMQKLPVAIPAHKLPDRILPRKFSVSAKIPETK
WCQKCCVVRNPYTGHKYLCGALQISIVLLEWV
EPMQKFMLIKHIDFPIPCPLKMFEMLVVPEQEYP
LVCVGVSRGRDFNQVVRFETVNPNSTSSWFTES
DTPQTWTHVTQLERDTILVCLDCCIKIVNLQGR
LKSSRKLSSELTFDFRIESIVCLQDSVLAFWKHG
MQGRSFRSNEVTQEISDSTRIFRLLGSDRVVVLES
RPTDNPTANSNLYILAGHENSY 



PPGEPGAGRGALSPCGPLSGPPPLPGREAGGTCG 

QPVNPVFDLSRRNPQEDFELIQRIGSGTYGDVYK 

ARIW^TGELAAIKVIKLEPGEDFAWQQEIIMMK 

D\CKHP\DIVAYF\GSYL\RRDKLWI\CMEF\CGSGS 

\LQDIYHVTGPLSELQIAYVSRETLQGLYYLHSKG

KMHRDIKGANILLTDNGHVKLADFGVSAQITATI 

AKRKSFIGTPYWMAPEVAAVERKGGYNQLCDL 

WAVGITAIELAELQPPMFDLHPMRALFLMTKSNF 

QPPKLKDKMKWSNSFHHFVKMALTKI^KKRPT 

AEKLLQHPFVTQHLTRSLAIELLDKVNNPDHSTY

HDFDDDDPEPLVAVPHRIHSTSRNVREEKTRSEIT 

FGQVKFDPPLRKETEPHHELPDSDGFLDSSEEIYY 

TARSNLDLQLEYGQGHQG\GYFLGANKSLLKSV 

EEELHQRGHVAHLEDDEGDDDESKHSTLKAKIP 

PPLPPKPKSIFIPQEMHSTEDENQGTIKRCPMSGSP 

\AKPSQVPPRPPPPRLPPHKPVALGNGMSSFQLNG 

ERDGSLCQQQNEHRGENLSRKEKKDVPKPISNG 

LPPTPKVHMGACFSKVFNGCPLKIHCASSWINPD

TRDQYLIFGAEEGIYTLNLNELHETSMEQLFPRR

CTWLYVMNNCLLSISGKASQLYSHNLPGLFDYA 

RQMQKLPVAIPAHKLPDRILPRKFSVSAKIPETK

WCQKCCVVRNPYTGHKYLCGALQTSIVLLEWV 

EPMQKFMLIKHIDFPIPCPLKMFEMLVVPEQEYP 

LVCVGVSRGRDFNQVVRFETVNPNSTSSWFTES

DTPQTWTHVTQLERDTILVCLDCCIKIVNLQGR

LKSSRKLSSELTFDFRIESIVCLQDSVLAFWKHG 

MQGRSFRSNEVTQEISDSTRIFRLLGSDRVVVLES 

RPTDNPTANSNLYILAGHENSY 



YSLRKUlTFKLAGAMVfflKKGELTQEEKEUSvr 
GKGTVQEAGTLLSSKNVRVNCLDENGMTPLMH 
AAYKGKLDMCKLLLRHGADVNCHQHEHGYTA 
LMFAALSGNKDITWVMLEAGAETDWNSVGRT 
AAQMAAFVGQHDCVTIINNFFPRERLDYYTKPQ 
GLDKEPKLPPKLAGPLHKIirrimHPVKIVMLV 
NENPLLTEEAALNKCYRVMDLICEKCMKQRDM 
NEVLAMKMHYISCIFQKCINFLKDGENKLDTLIK 
SLLKG\RASDGFPVYPEKILRESIRK\FPYCEATLL 
QQLVRSL^PVEIGSDPTAFSVLTQAITGQVGFVDV 
EFCTTCGEKGASKRCSVCKMVIYCDQTCQKTHW 
FTHKKICKNLKDIYEKQQLEAAKEKRQEENHGK 
LDVNSNCVNEEQPEAEVGISQKDSNPEDSGEGK 
KESLESEAELEGLQDAPAGPQVSEE 



2978 



5177 



SDDLRTGLFQDVQDAESLKLPGVYEVLFYNETE 
DCPGMMLWRYPEPRGLTLVRITPVPFNTTEDPDI 
STADLGDVLQDPCSLEYWDELQKVFVAFREFNL 
SESKVCELQLPDINLVNDQKKLVSSDLWRIVLNS 
SQNGADDQSSASESGSQSTCDPLVTPTALAACTR 










VDSCFTPWFVPSLCVSFQFAHLEFHLCHHLDQLG 

TAAPQYLQPFVSDRNMPSELEYMIVSFREPHMYL 

RQWNNGSVCQEIQFLAQADCKLLECRNVTMQS 

VVKPFSIFGQMAVSSDVVEKLLDCTVIVDSVFVN 

LGQHWHSLNTAIQAWQQNKCPEVEELVFSHFV 

ICNDTQETLRFGQVDTDENELLASLHSHQYSWRS 

HKSPQLLHICIEGWGNWRWSEPFSVDHAGTFIRT 

IQYRGRTASLIIKVQQLNGVQKQIIICGRQnCSYL 

SQSIELKWQHYIGQDGQAVVREHFDCLTAKQK 

LPSYELENNELTELCVKAKGDEDWSRDVCLESK 

APEYSIVIQVPSSNSSIIYVWCTVLTLEPNSQVQQ 

RMIVFSPLFIMRSHLPDPniHLEKRSLGLSETQnP 

GKGQEKPLQNIEPDLVHHLTFQAREEYDPSDCA 

VPISTSLIKQIATKVHPGGTVNQELDEFYGPEKSL 

QPIWPYNKKDSDRNEQLSQWDSPMRVKLSIWKP 

YVRTLLIELLPWALLINESKWDLWLFEGEKIVLQ 

VPAGKniPPNFQEAFQIGIYWANTNTVHKSVAIK 

LVHNLTSPKWKDGGNGEVVTLDEEAFVDTEIRL 

GAFPGHQKLCQFCISSMVQQGIQIIQIEDKTTIINN 

TPYQIFYKPQLSVCNPHSGKEYFRVPDSATFSICP 

GGEQPAMKSSSLPCWDLMPDISQSVLDASLLQK

QEMLGFSPAPGADSSQCWSLPAIVRPEFPRQSVA 

VPLGNFRENGFCTRAIVLTyQEHLGVTYLTLSED 

PSPR\aiHNRCPVKMLIKENIKDIPKFEVYCKKIPS 

ECSIHHELYHQISSYPDCKTKDLLPSLLLRVEPLD 

EVTTEWSDAIDINSQGTQVVFLTGFGYVYVDW 

HQCGTVFITyAPEGKAGPILTOTNK\PEKIVT^ , 

MFITQLSLAVFDDLTHHKASAELlJU^ 

VAPGAGPLPGEEPVAALFELYCVEICCGDLQLDN 

QLYNKSNFHFAVLVCQGEKAEPIQCSKMQSLLIS 

NKELEEYKEKCFIKLCITLNEGKSILCDINEFSFEL 

KPARLYVEDTFVYYIKTLFDTYLPNSRLAGHSTH 

LSGGKQVLPMQVTQHARALVNPVKLRKLVIQPV 

NLLVSIHASLKLYIASDHTPLSFSVFERGPIFTTAR 

QLVHALAMHYAAGALFRAGWVVGSLDBLGSPA 

SLVRSIGNGVADFFRLPYEGLTRGPGAFVSGVSR 

GTTSFVKfflSKGTLTSITOLATSLARNMDRLSLDE 

EHYNRQEEWRRQLPESLGEGLRQGLSRLGISLLG 

AIAGIVDQPMQNFQKTSEAQASAGHKAKGVISG 

VGKGIMGVFTKPIGGAAELVSQTGYGILHGAGLS 

QLPKQRHQPSDWHADQAPNSHVKYVWKMLQS 

LGRPEVHMALDVVLVRGSGQEHEGCLLLTSEVL 

FWSVSEDTQQQAFPVTEIDCAQDSKQNNLLTV 

QLKQPRVACDVEVDGVRERLSEQQYNRLVDYIT 

KTSCHLAPSCSSMQIPCPVVAAEPPPSTVKTYHY 

LVDPHFAQVFLSBOnivlVKNKALRKGFP 


2979 


A 


255 


2673 


AWLFPASVLCPRCLTGSAVGSAEWKSLWLFPFS 

SRPTLGHLDSKPSSKSNMIRGRNSATSADEQPHIG 

NYRLLKTIGKGNFAK\^ARHILTGKEVAVKIID 

KTQLNSSSLQKLFREVRIMKVLNHPNIVKLFEVIE 

TEKTLYLVMEYASGGEVFDYLVAHGRMKEKEA 

RAKFRQIVSAVQYCHQKHVHRDLKAENLLLDA 

DMhnKIADFGFSNEFTFGNKLDTFCGSPPYAAPEL 

FQGKKYDGPEVDVWSLGVILYTLVSGSLPFDGQ 

M-KELRERVLRGKYRIPFYMSTOCENIXBCKFLIL 
2980



120 



3433 



NPSJUCCilLEQINIKDRWMNVGHE\DDELKPYGEP 
LPVDYKDPRRTELMVSMGYTREEIQDSLVGQRYN 
EVMATYLLLGYKSSELEGDTITLKPRPSADLTNS 
SAPSPSHKVQRSVSANPKQRRFSDQAGPAIPTSNS 
YSKKTQSNNAENKRPEEDRESGRKASSTAKVPA 
SPLPGLERKKTTPTPSTNSVLSTSimSRNSPLL\E 
RASL\GQGFHPEWAKTALTMPGSRASTASASAA 
VSAARPRQHQKSMSASVHPNKASGLPPTESNCE 
VPRPRQVCWGSCTAPQRVPVASPSAHNISSSGGA 
PDRTNFPRGVSSRSTFHAGQLRQVR\DQQNLPYG 
VTPASPSGHSQGRRGASGSIFSKFTSKFVRRNLNE 
PBSKDRWETLRPHVWNSGGNDKEKBEFREAKPR 
SLRFTWSMKTTSSMEPNEMMREIRKVLDANSCQ 
SELHEKYMLLCMHGTPGHEDFVQWEMEVCKLP 
RLSLNGVRFKRISGTSMAFKhriASKIANELKT. 



2981 



120 



3433 



NCjLLLyAKUl-HGEffiDLQQWLrDTERHLLASKP 
LGGLPETAKEQLNVHMEVCAAFEAKEETYKSLM 
QKGQQMLARCPKSAETNIDQDINNLKEKWESVE 
TKLNER\K1\KLEEAL>JLA\MEFHNSL\QDFINWLT 
QAEQTLNVASRPSLILDTVLFQEDEHKVFANEVN 
SHREQIIELDKTGTHLKYFSQKQDVVLIKNLLISV 
QSRWEKWQRLVERGRSLDDARKRAKQFHEAW 
SKLMEWLEESEKSLDSELEIAMDPDKIKTQLAQH 
KEFQKSLGAKHSVYDTTNRTGRSLKEKTSLADD 
NLKLDDMLSELRDKWDTICGKSVERQNKLEEA\ 
LLFSGQFTDALQALIDWLYRVEPQLAEDQPVHG 
DIDLVMNLIDNHKAFQKELGKRTSSVQALKRSA
RELIEGSRDDSSWVKVQMQELSTRWETVCALSIS
KQTRLEAALRQAEEFHSWHALLEWLAEAEQTL 
RFHGVLPDDEDALRTLIDQHKEFMKKLEEKRAE 
LNKATTMGDTVLAICHPDSITTKHWmiRARPEE 
VLAWAKQHQQRLASALAGLIAKQELLEALLAW 
LQWAEITLTDKDKEVIPQEIEEVKALIAEHQTFM 
EEMTRKQPDVDKVTKTYKRRAADPSSLQSHIPV
LDKGRAGRKRFPASSLYPSGSQTQIETKNPRVNL 
LVSKWQQVWLLALERRRKLNDALDRLEELREF 
ANFDFDIWRKKYMRWMNHKKSRVMDFFRRIDK
DQDGKITRQEFIDGILSSKFPTSRLEMSAVADIFD 
RDGDGYIDYYEFVAALHPNKDAYKPITDADKIE 
DEVTRQVAKCKCAKRFQVEQIGDNKYRFFLGNQ 
FGDSQQLRLVRILRSTVMVRVGGGWMALDEFL 
VKNDPCRAKGRTNMELREKFILADGASQGMAA 
FRPRGRRSRPSSRGASPNRSTSVSSQAAQAASPQ 
VPATTTPKILHPLTRNYGKPWLTNSKMSTPCKAA 
ECSDFPVPSAEGTPIQGSKLRLPGYLSGKGFHSGE 
DSGLITTAAARVRTQFADSKKTPSRPGSRAGSKA 
GSRASSRRGSDASDFDISEIQSVCSDVETVPQTHR 
PTPRAGSRPSTAKPSKIPTPQRKSPASKLDKSSKR 



NCLLLyAKUl-liGEIEDLQQWLTDTERHLLASKP 
LGGLPETAKEQLNVHMEVCAAFEAKEETYBKLM 
QKGQQMLARCPKSAETNIDQDINNLKEKWESVE 
TBa.NER\KTMCLEEALNLA\MEFHNSL\QDFINWLT 
QAEQTLNVASRPSLILDTVLFQIDEHKVFANEVN 
SHREQIIELDKTGTHLKYFSQKQDVVLIKNLLISV
QSRWEKWQRLVERGRSLDDARKRAKQFHEAW 










SKLMEWLEESEKSLDSELEIANDPDKIKTQLAQH 

KEFQKSLGAKHSVYDTTNRTGRSLKEKTSLADD 

NLKLDDMLSELRDKWDTICGKSVERQNKLEEAX 

LLFSGQFTDALQALIDWLYRVEPQLAEDQPVHG

DIDLVMNLIDNHKAFQKELGKRTSSVQALKRSA

RELIEGSRDDSSWVKVQMQELSTRWETVCALSIS

KQTRLEAALRQAEEFHSWHALLEWLAEAEQTL

RFHGVLPDDEDALRTLIDQHKEFMKKLEEKRAE

LNKATTMGDTVLAICHPDSITTIKHWITIIRARF 

VLAWAKQHQQRLASALAGLIAKQELLEALLAW 

LQWAETTLTDKDKEVIPQEIEEVKALIAEHQTFM 

EEMTRKQPDVDKVTKTYKRRAADPSSLQSHIPV 

LDKGRAGRKRFPASSLYPSGSQTQIETKNPRVNL

LVSKWQQVWLLALERRRKLNDALDRLEELREF

ANFDFDIWRKKYMRWMNHKKSRVMDFFRRIDK

DQDGKITRQEFIDGILSSKFPTSRLEMSAVADIFD 

RDGDGYIDYYEFVAALHPNKDAYKPITDADKIE 

DEVTRQVAKCKCAKRFQVEQIGDNKYRFFLGNQ 

FGDSQQLRLVRILRSTVMVRVGGGWMALDEFL 

VKNDPCRAKGRTNMELREKFILADGASQGMAA 

FRPRGRRSRPSSRGASPNRSTSVSSQAAQAASPQ 

VPATTTPKILHPLTRNYGKPWLTNSKMSTPCKAA

ECSDFPVPSAEGTPIQGSKLRLPGYLSGKGFHSGE 

DSGLITTAAARVRTQFADSKKTPSRPGSRAGSKA 

GSRASSRRGSDASDFDISEIQSVCSDVETVPQTHR 

PTPRAGSRPSTAKPSKIPTPQRKSPASKLDKSSKR 


2982








7MAAGGAEGGSGPGAAMGDCAEIKSQFRTREGF 
YKLLPGDG^dUR^GPASAQTPVPPQPPQPPPGPA 
SASGPGAAGPASSPPPAGPGPGPALPAVRLSLVR 
LGEPDSAGAGEPPATPAGLGSGGDRVCFNLGRE 
LYFYPGCCRRGSQRWHTPLTPFLPPLKSEDLNKPI 
DKRTYKGTQPTCHDFNQFTAATETISLLVGFSAG 
QVQYLDLIKKDTSKLFNEERLIDKTKVTYLKWLP 
ESESLFLASHASGHLYLYNVSHPCASAPPQYSLL 
KQ\AWGFSFYAAKSKAPRNPLAKWAVGEGPLNE 
FAFSPDGRHLACVSQDGCLRVFHFDSMLLRGLM 
KSYFGGLLCVCWSPDGRYVVTGGEDDLVTVWS 
FTEGRVVARGHGHKSWVNAVAFDPYTTRAEEA 
ATAAGADGERSGEEEEEEPEAAGTGSAGGAPLSP 
LPKAGSITYRFGSAGQDTQFCLWDLTEDVLYPHP 
PLARTRTLPGTPGTTPPAASSSRGGEPGPGPLPRS 
LSRSNSLPHPAGGGKAGGPGVAAEPGTPFSIGRF 
ATLTLQERRDRGAEKEHKRYHSLGNISRGGSGG 
SGSGGEKPSGPVPRSRLDPAKVLGTALCPRIHEV 
PLLEPLVCKKIAQERLTVLLFLEDCETACQEGLIC 
TWARPGKAFTDEETEAQTGEGSWPRSPSKSWE 
GISSQPGNSPSGTW 


2983 


A 


3855 


220 


RRFRLSAHRAQPCCRCRGLEMPRGVFQQLSNLV 

LQELNANLSNLTSAFEKATAEKIKCQQEADATN 

RVILLANRLVGGLASENIRWAESVENFRSQGVTL 

CGDVLLISAFVSYVGYFTKKYKNELMEKFWPYI 

HNLKVPIPITNGLDPLSLLTDDADVATWNNQGLP 

SDRMSTENATELGNTERWPLIVDAQLQGIKWIKN 

KYRSELKAIRLGQKSYLDVIEQATSEGDTLLIENI 

GETVDPALDPLLGRNTIKKGKYIKIGDKEVGVPP 










QVPPDPTHQVLQPTLQARDAGSVmLINFLVTRD " 

GLEDQLLAAWAKERPDLEQLKA>(LTKSQNEFK 

IVLKELEDSLLARLSAASGNFLGDTALVENLETT 

KHTASEIEEKWEAKITEVKINEARENYRPAAER 

ASLLYFILNDLNKINPyYQFSLKAFNWFEKAIQR 

TTPANEVKQRVINLTDEnYSVYMYTARGLFERD 

KLIFLAQVTFQVLSMKKELNPVELDFLLRFPFKA 

GWSPVDFLQHQGWGGIKALSEMDEFKNLDSDI 

EGSAKRWKKLVESEAPEKEIFPKEWKNKTALQK 

LCMVRCLRPDRMTYAIKNFVEEKMGSKFVEGRS 

VEFSKSYEESSPSTSIFFILSPGVDPLKDVEALGKK 

LGFTIDNGKLinWSLGQGQEVVAENALDVAAEK 

GHWVlLQNIHLVARWLGTLDKKLERySTGRHED 

YRVFIRAEPAPSPETHnPQGILENAIKITNEPPTGM 

YANLYKALDLFTQDTLEMCTKEMEFKCMLFAL 

CYFHAVVAERRKFGAQGWNRSYPFNNGDLTISI 

NVLYNYLEANPKVPWDDLRYLFGEIMYGGHITD 

DWDRRLCRTYLAEYIRTEMLEGDVLLAPGFQIPP 

NLDYKGYHEYIDENLPPESPYLYGLHPNAEIGFL 

TVTSEKLFRTVLEMQPKETDSGAGTGVSREEKV 

B0\VLDDILEKIPETFN1VIAEIMAKAAEKTPYVVV 

AFQECERMNILTNEMRRSLKELNLGLKGELTITT 

DVEDLSTALFYDTVPDTWVARAYPSMMGLAAW 

YAha.LLRIRELEAWTTDFALPTTVWLAGFFNPQS 

FLTAIMQSMARK]^WPLDKMCLSVEVTKKMRE 

DMTAPPREGSYVYGLFMEGARWDipTGVIAEA 

RIJKELTPAMPVMKAIPVARMETKNIYECPVYKT 

RmOPTYVWTFNLKTKEKAAKWILAAVALLLOV 


2984 


A 


2 


1464 


i- VLFPGIAME JPOASASSLLLPAASRPPRKREAGE 

AGAATSKQRVLDEEEYIEGLQTVrQRDFFPDVEK 

LQAQKEYLEAEENGDLERMRQIAKFGSALGKM 

SREPPPPYVTPATFETPEVHAGTGWGNKPRPRG 

RGLEDGEAGEEEEKEPLPSLDVFLSRYTSEDNAS 

FQEIMEVAKERSRARHAWLYQAEEEFEKRQKDN 

LELPSAEHQAIESSQASVETWKYKAKNSLMYYP 

EGVPDEEQLFKKPRQVVHKNTRFLRDPFSQALSR 

CQLQQAAALNAQHKQGKVGPDGJCELEPQESPRV 

GGFGFVATPSPAPGVNESPMMTWGEVENTPLRV 

EGSETPYVDRTPGPAFKDLEPGRRERLGLKMANE 

AAAKNRAJCKQEALRRVTENLASLTPKGLSPAMS 

PALQRLVSRTASKYTDRALRASYTPSPARSTHLK 

NPGPVGCRPPQSTPGA/PGSATRTPL\TQDPA\SIT 

DNLLQLPARRKASDFF 


2985 


A 


1890 


178 



ASTQEAGLLSPPGVGAQRCWNFVACLPVRACAD

MASNDYTQQATQSYGAYPTQPGQGYSQQSSQP 

YGQQSYSGYSQSTDTSGYGQSSYSSYGQSQNSY 

GTQSTPQGYGSTGGYGSSQSSQSSYGQQSSYPGY 

GQQPAPSSTSGSYGSSSQSSSYGQPQSGSYSQQPS 

YGGQQQSYGQQQSYNPPRGYGQQNQYNSSSGG

GGGGGGGGSYGQDQSSMSGSGGGGGGGGGGGS

GGGGGYGNQDQTGAAGSRGYRQ\QDRGGRCRG

GSGGGGS\GGAAGYNRSSGGYEPRGRGGGRGGR

GGMGGSDRGGFNKFGGPRDQGSRHDSEQDNSD

NNTIFVQGLGENVTIESVADYFKQIGIIKTNKKTG

QPMINLYTDRETGKLKGEATVSFDDPPSAKAAID










WFDGKEFSGNPIKVSFATRRADFNRGGGNGRGG 

RGRGGPMGRGGYGGGGSGGGGRGGFPSGGGGG 

GGQQRAGDWKCPNPTCENMNFSWRNECNQCK 

APKPDGPGGGPGGSHMGGNYGDDRRGGRGGYD 

RGGYRGRGGDRGGFRGGRGGGDRGGFGPGKM 

DSRGEHRQDRRERPY 


2986 


A 


1890 


178


ASTQEAGLLSPPGVGAQRCWNFVACLPVRACAD 

MASNDYTQQATQSYGAYPTQPGQGYSQQSSQP 

YGQQSYSGYSQSTDTSGYGQSSYSSYGQSQNSY 

GTQSTPQGYGSTGGYGSSQSSQSSYGQQSSYPGY 

GQQPAPSSTSGSYGSSSQSSSYGQPQSGSYSQQPS 

YGGQQQSYGQQQSYNPPRGYGQQNQYNSSSGG 

GGGGGGGGSYGQDQSSMSGSGGGGGGGGGGGS 

GGGGGYGNQDQTGAAGSRGYRQ\QDRGGRCRG 

GSGGGGS\GGAAGYNRSSGGYEPRGRGGGRGGR 

GGMGGSDRGGFNKFGGPRDQGSRHDSEQDNSD 

NNTIFVQGLGENVTIESVADYFKQIGIIKTNKKTG

QPMINLYTDRETGKLKGEATVSFDDPPSAKAAID

WFDGKEFSGNPIKVSFATRRADFNRGGGNGRGG

RGRGGPMGRGGYGGGGSGGGGRGGFPSGGGGG

GGQQRAGDWKCPNPTCENMNFSWRNECNQCK

APKPDGPGGGPGGSHMGGNYGDDRRGGRGGYD 

RGGYRGRGGDRGGFRGGRGGGDRGGFGPGKM 

DSRGEHRQDRRERPY 


2987 


A 


1376 


898 



GGAKAGGAPHPFTLPFRHVGGLSAAPEEVEGML 
WAGARQHGRNWRXRETSPGTQGPLPPVPRA/PP 
GPDG\PHAIAPl[isWAIPRQQGSP^ . 
RCSGPltfGDRAPESCFPGACSVSGACAFKGTRPA 
CPPQEPSLRSSRNRLREGQTFGRMEI 


2988 


A 


1 


1011 


MGNDSVSYEYGDYSDLSDRPVDCLDGACLAIDP 

LRVAPLPLYAAIFLVGVPGNAMVAWVAGKVAR 

RRVGATWLLHLAVADLLCCLSLPILAVPIARGGH 

WPYGAVGCRALPSULLTMYASVLLLAALSADLC 

FLALGPAW\CLRFS/GACGVQVACGAAWTLALL 

LTVPSAIYRRLHQEHFPARLQCVVDYGGSSSTEN 

AVTAIRFLFGFLGPLVAVASCHSALLCWAARRC 

RPLGTAIWGFFVCWAPYHLLGLVLTVAAPNSA 

LLARALRAEPLIVGLALAHSCLNPMLFLYFGRAQ 

LRRSLPAACHWALRESQGQDESVDSKKSTSHDL 

VSEMEV 


2989 


A 


27 


4074 


KSQLFCFWVGKAGDILSGDQDKEQKDPYFVETP 

YGYQLDLDFLKYVDDIQKGNTIKRLNIQKRRKPS 

VPCPEPRTTSGQQGIWTSTESLSSSNSDDNKQCP 

NFLIARSQVTSTPISKPPPPLETSLPFLTIPENRQLP 

PPSPQLPiaiNLHVTKTLMETOiaa.EQERATMQM 

TPGEFRRPRLASFGGMGTTSSLPSFVGSGNHNPA 

KHQLQNGYQGNGDYGSYAPAAPTTSSMGSSIRH 

SPLSSGISTPVTNVSPMHLQHIREQMAIALKRLKE 

LEEQVRTIPVLQVKISVLQEEKRQLVSQLKNQRA 

ASQINVCGVRKRSYSAGNASQLEQLSRARRSGG 

ELYTDYEEEEMETVEQSTQRIKEFRQLNTADMQA 

LEQKIQDSSCEASSELRENGECRSVAVGAEENMN 

DIVVYHRGSRSCKDAAVGTLVEMRNCGVSVTEA 

MLGVMTEADKEmLQQQTffiSLKEKIYRLEVQLR 

ETTHDREMTKLKQELQAAGSRKKVDKATMAQP 










LVFSKWEAWQTRDQMVGSHMDLVDTCVGTS

VETNSVGISCQPECKl^WGPELPMNWWIVKER 

VEMHDRCAGRSVEMCDKSVSVEVSVCETGSNTE 

ESVNDLTLLKTNLNLKEVRSIGCGDCSVDVTVCS 

PKECASRGVNTEAVSQVEAAVMAVPRTADQDT 

STDLEQVHQFTNTETATLIESCTNTCLSTLDKQTS 

TQTVETRTVAVGEGRVKDINSSTKTRSIGVGTLL 

SGHSGFDRPSAVKTBCESGVGQININDNYLVGLK 

MRTIACGPPQLTVGLTASRRSVGVGDDPVGESLE 

NPQPQAPLGMMTGLDHYIERIQKLLAEQQTLLA 

ENYSELAEAFGEPHSQMGSLNSQLISTLSSINSVM 

KSASTEELRNPDFQKTSLGKITGSYLGYTCKCGG 

LQSGSPLSSQTSQPEQEVGTSEGKPISSLDAFPTQ 

EGTLSPVNLTDDQIAAGLYACTNNESTLKSIMKK 

KDGNKDSNGAKKNLQFVGINGGYETTSSDDSSS 

DESSSSESDDECDVIEYPLEEEEEEEDEDTRGMAE 

GHHAVNIEGLKSARVEDEMQVQECEPEKVEIRE 

RYELSEKMLSACNLLKNTINDPKALTSKDMRFC 

LNTLQHEWFRVSSQKSAIPAMVGDYIAAFEAISP 

DVLRYVINIjVDGNGNTALHYSVSHSNFEIVKLLL 

DADVCNVDHQNKAGYTPIMLAALAAVEAEKDM 

RIVEELFGCGDVNAKASQAGQTALMLAVSHGRI 

DMVKGLLACGADVNIQDDEGSTALMCASEHGH 

VEIVKLLLAQPGCNGHLEDNDGSTALSIALEAGH

KDIAVLLYAHVNFAKAQSPGTPRLGRKTSPGPTH 
RGSFD 


2990


A 


69, 1 


1687


.JERLRPGQRMRGPVPAAGACASLPPRAGPAQGRH" 
AALGGAEPGSHLHCGVRLQRREEPGGQQRLLPQ 
RGGSAQTGHQHPGPYECQCPGPQPGGTTPALLSL 
DLEETRGPPASANPDKDHSTQPGTMGRKKIQISRI 
LDQRmQVTFTKRKFGLMKKAYELSVLCDCEIA 
LIIFNSATRLFQYASTDMDRVLLKYTEYSEPHESR 
TNTDDLETLKRRGIGLDGPELEPDEGPEEPGEKFR 
RLAGEGGDPALPRPRLYPAAPAMPSPDV VYGAL 
PPPGVCDPSGLGEALPAQSRPSPFRPAAPKAGPPG 
LGHPLFSPSHLTSKTPPPLYLPTEGRRSDLPGGLA 
GPRGGLNTSRSLYSGLQNPCSTATPGPPLGSFPFL 
PGGPPVGAEAWARRVPQPAAPPRRPPQSSIKSER 
LFLRPPGAPATFLRPSPIPCSSPGPWQSLCGLGPP\ 
CAGCPWPTAGPGRRSPGGTSPERSPGTARARGDP 
\TSLQAFSEKTHTVTAPLRGGGLEVGGWTQSSAG 
GLLSFFLFVCISTNKNARGVRGPEKK 


2991
2992


A


3

1


1159

636


IPQPLHCASPKEEMSLRCGDAARTLGPRVFGRYF 
CSPVRPLSSLPDKKKELLQNGPDLQDFVSGDLAD 
RSTWDEYKGNLKRQKGERLRLPPWLKIIEIPMGK 
NYNKLKNTLRNLNLHTVCEEARCPNIGECWGGG 

EYATATATIMLMGDTCTRGCRFCSVKTARNPPP
LDASEPYNTAKAIAEWGLDYWLTSVDRDDMP

DGGAEHIAKTVSYLKERNPKILVECLTPDFRGDL 
rw/xLCirv V x\ijo\jliU V i /VrtlV Vii 1 V JrxlL»QoJv V RDPRA 

NFDQSLRVLKHAKKVQPDVISKTSIMLGLGENDE 
QVYATMKALREADVDCLTLGQYMQPTRJRHLKV 
EEYITPEKFKYWEKVGNELGFHYTASGP\LVRSS 
yKAGEFFLBO^VAKRKTKDL 

PVPGVPTSPPSCCPQDMOGPWVLLLLGLRLQLSL 



WO 01/57190










GVIPAEEENPAFWNRQAAEALDAAKKLQPIQKV 

AKNLILFLGDGLGVPTVTATRILKGQKNGKLGPE 

TPLAMDRFPYLALSKTYNVDRQVPDSAATATAY 

LCGVKANFQTIGLSAAARFNQCNTTRGNEVISV 

MNRAKQAGKSVGWTTTRVQHASPAGTYAHTV 

NRNWYSDADMPASARQEGCQDIATQLISNMDID 

VILGGGRKYMFPMGTPDPEYPADASQNGIRLDG 

KNLVQEWIJVKHQGAWYVWNRTELMQASLDQS 

VTHLMGLFEPGDTKYEIHRDPTLDPSLMEMTEA 

ALRLLSRNPRGFYLFVEGGRIDHGHHEGVAYQA 

LTEAVMFDDAIERAGQLTSEEDTLTLVTADHSH 

VFSFGGYTLRGSSIFGLAPSKAQDSKAYTSILYGN 

GPGYVFNSGVRPDVNESESGSPDYHQQAGWPLS 

SETHGGEDVAVFARGPQAHLVHGVQEQSFVAH 

VMAFAACLEPYTACDLAPPACTTDAAHPVAASL 

PLLAGTLLLLGASAAP 


2993 


A 


3 


685 


DAWARLLKMNRLFGKAKPKAPPPSLTDCIGTVD 

SRAESrokKISRLDAELVKYKDQIKKMREGPAKN 

MVKQKALRVLKQKRMYEQQRDNLA\NSHSTW\ 

TS\HYTIQSLKDTXTTVDAMKLGVKEMKKAYKQ 

VKIDQIEDLQDQLEDMMEDANEIQEALSRSYGTP 

ELDEDDLEAELDALGDELLADEDSSYLDEAASA 

PAIPEGVPTDTKNKDGVLVDEFGLPQIPAS 


2994 


A 


1710


161 


RRCELTPFUKTLILPKSWGAFPEDWMQHVSSSQ 
SSQRHVQWPGACPGAGEEQPACSQPSLPLTLPSP 
SHQLQQLMVRGGPAGGQNMNVDLQGVGPGLQ 
' GSPQVTLAPIiPLPSPTSPGFQFSAQPRRFEHGSPS 
YIQVTSPLSQCJVQTQSPTQPSPGPGQALQNVRAG 
APGPGLGLCSSSPTGDFVDASVLVRQISLSPSSGG 
HFVFQDGSGLTQIAQGAQVQLQHPGTPITVRERR 
PSQPHTQSGGTIHHLGPQSPAAAGGAGLQPLASP 
SHITTANLPPQISSnQGQLVQQQQVLQGPPLPRPL 
GFERTPGVLLPGAGGAAGFGMTSPPPPTSPSRTA 
VPPGLSSLPLTSVGNTGMKKVPKKLEEIPPASPE 
MAQMRKQCLDYHHQEMQALKEVFiCEYLIELFF 
LQHFQGNMMDFLAFKERLYGPLQAYLRQNDLDI 
EEEEEE\HFEVINDEVKWARKHGQPGTPVAIA'n 
QLPPRTSAAFPAQQQPLQVLSDGSTVQLPRLSSL 
GFEDSMC 


2995 


A 


3


924 


SAPSGEDASTHAFARCKHPINVRRDPSIPIYGLRQS 

ILLNTRLQDCYVDSPALT^«WMARTCAKQNINAP 

APATTSSWEWRNPLIASSFSLVKLVLRRQLKNK 

CCPPPCKFGEGKLSKRLKHKDDSVMKATQQARK 

RNFISSKSKQPAGHRRPAGGIRESKESSKEKKLTV 

RQDLEDRYAEHVAAT\QALPQDSGTAAWKG\RV 

LLPETQKRQQLSEDTLTIHGLPTEGYQALYHAW 

EPMLWNPSGTPKRYSLELGKAKQKLWEALCSQ 

GAISEGAQRDRFPGRKQPGVHEEPVLKKWPKLK 

SKK 


2996 


A 


3 


1713 


GKFGIKPSQRRISGKSTFHSEMEGEDTRDDSLYSI 

LEELWQDAEQIKRCQEKHNKLLSRTTFLNKKILN 

TEWDYEYKDFGKFVHPSPNLILSQKRPHKRDSFG 

KSFKHNLDLH]HNKSNAAK2s[LDKTIGHGQVFTQ 

NSSYSHHENTHTGVKFCERNQCGKVLSLKHSLS 

QNVKFPIGEKANTCreFGKIFTQRSHFFAPQKIHT

SEQ ID NO:


Method


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










VEKPHHLSKCVNVFTQKPLLS1YLRVHRDEKLYI\

CTKl^CGKGUJPRNSELIMHEKTHTREKPYKCNE 

\CGKSFFQVSSLLRHQrrHTGEKLFECSECGKGFS 

LNSALNIHQKIHTGERHHKCSECGKAFTQKSTLR 

MHQRIHTGERSYICTQCGQAFIQKAHLIAHQRIH 

TGEKPYBCSDCGKSFPSKSQLQMHKRIHTGEKPY 

ICTECGKAFTNRSNLNTHQKSHTGEKSYICAEGG 

KAFTDRSNFNKHQTIHTGEKPYVCADCGRAFIQK 

SELITHQRIHTTEKPYKCPDCEKSFSKKPHLKVHQ 

RIHTGEKPYICAECGKAFTDRSNFNKHQTIHTGD 

KPYKCSDCGKGFTQKSVLSMHRirarr 


2997 


A 


3 


1763 


AASTRTMGSKHFEGIYDHVGHFGRFQRVLYFICA 

FQNISCGIHYLASVFMGVTPHHVCRPPGNVSQW 

FHNIJSNWSLEDTGALLSSGQKDYVTVQLQNGEI 

WELSRCSRNKRENTSSLGYEYTGSKKEFPCVDG 

YIYDQNTWKSTAVTQWNLVCDRKWLAMLIQPL 

FMFGGPTGIGA^TFGYF\SDRLGRRVVLWATSSS 

MFLFGIAAAFAVDYYTFMAARFFLAMVASGYLV 

VGFVYVMEnOMKSRTWASVHLHSFFAVGTLLV 

ALTGYLVRTWWLYQMILSTVTVPFILCCWVLPE 

TPFWLLSEGRYEEAQKMVDIMAKWNRASSCKLS 

ELLSLDLQGPVSNSPTEVQKHNLSYLFYNWSITK 

RTLTVWLIWFTGSLGFYSFSLNSVNLGGNEYLNL 

FLLGVVEIPAYTFVClAMDKVGRRTVLAYSLFas 

ALACGWMVDPQKHYILGWTAMWGKILPIGAA 

FG\LIYLYTAELYPTIVRSLAVGSGSMVCRLASE. 

APFSVDLS.SIWIFIPQLFVGTMALLSGVLTLKLPE 

TLGKRLATTWEEAAKLESENESKSSKLLLTTNNS 

GLEKTEAITPRDSGLGE 


2998 


A 


3 


1441 


QRPASQLLAPFAAEALPGAPRAAMAQHFSLAAC 

DVVGFDLDHTLCRYNLPESAPLIYNSFAQFLVKE 

KGYDKELLNVTPEDWDFCCKGLALDLEDGNFL 

KIANNGTVLRASHGTKMMTPEVLAEAYGKKEW 

KHFLSDTGMACRSGKYYFYDNYFDLPGALLCAR 

VVDYLTKLNNGQKTFDFWKDIVAAIQHNYKMS 

AFKENCGIYFPEIKRDPGRYLHSRPESVKKWLRQ 

LKNAGKILLLITSSHSDYCRLLCA\YILGNDFTDLF 

DIVITNALKPGFFSHLPSQRPFRTLENDEEQEALP 

SLDKPGWYSQGNAVHLYELLKKMTGKPEPKVV 

YFGDSMHSDIFPARHYSNWETVLILEELRGDEGT 

RSQRPEESEPLEKKGKYEGPBCAKPLNTSSKKWGS 

FFMDSVLGLENTEDSLVYTWSCmSTYSTIAIPSI 

EAIAELPLDYKFTRFSSSNSKTAGYYPNPPLVLSS 
DETLISK 


2999 


A 


320 


2417 



LRRRKMTPQSLLQTILFLLSLLFLVQGAHGRGHR 

EDFRFCSQRNQTHRSSLHYKPTPDLRISIENSEEA 

LTVHAPFPAAHPASRSFPDPRGLYHFCLYWNRH 

AGRLHLLYGKRDFLLSDKASSLLCFQHQEESLAQ 

GPPLLATSVTSWWSPQNISLPSAASFTFSFHSPPH 

rGAHNASVDMCELKRDLQLLSQFLKHPQKASRR 

PSAAPASQQLQSLESKLTSVRFMGDMGSFEEDRI 

NATVWKLQPTAGLQDLHIHSRQEEEQSEIMEYS 

WLLPRTLFQRTKGRSGEAEKRLLLVDFSSQALFQ 

OKNSSQVLGEKVLGIWQNTKVANLTEPVVLTF 

3HQLQPKNVTLQCVFWVEDPTLSSPGHWSSAGC

ETVRRETQTSCFCNHLTYFAVLMVSSVEVDAVH 

KHYLSLLSYVGCWSALACLVTIAAYLCSRVPLP 

CRRKPRDYTIKVHMNLLLAVFLLDTSFLLSEPVA 

LTGSEAGCRASAIFLHFSLLTCLSWMGLEGYNLY 

RLVVEVFGTYVPGYLLKLSAMGWGFPIFLVTLV 

ALVDVDNYGPnLAVHRTPEGVIYPSMCWIRDSL 

VSYITNLGLFSLVFLFhMAMLATKm^QILRLRPH 

TQKWSHVLTLLCLSLVLG\LPWALIFFSFASGTFQ 

LWLYLFSDTSFQGFLIFIWYWSMRLQARGGPSP 

LKSNSDSARLPISSGSTSSSRI 


3000 


A 


66 


1003 


SRGQLDAGQSSEQHGGNRQPEQSRSRSSSSSSSP 

RRSRSAAEPAMALSMPLNGLKEEDKEPLIELFVK 

AGSDGESIGNCPFSQRLFMILWLKGWFSVTTVD 

LKRKPADLQNLAPGTHPPFITFNSEVKTDVNKIEE 

FLEEVLCPPKYLKLSPKHPESNTAGMDIFAKFSA 

YIKNSRPEANEALBRGLLKTLQKLDEYLNSPLPD 

EIDENSMEDIKFSTRKFLDGNEMTLADCNLLPKL 

HIVKWAKKYRNFDIPKEMTGIWRyLTNAYSRD 

EFTNTCPSDKEVEIVAYSDVAKRLHQVKSRLLKE 

VSFMSSP 


3001 


A 


779 


2006 


LALTFRSALSTLPGSPMTSSGSPDLQLAWGPSLLP 

HPPSVWSPALPSCFAGPCPLLPLSDTQGWWGPN 

WLAPPSAALCRPDAAVWPDLPSSNILLVTPPPAK 

*SAVAV*PCPRGAHSLERAARQYTISGSSTSQSGK 

CSKRDTKCCAVTTSWGCFWQKHWKGDEDSGW 

AFQEGSHLGEGHL 


3002


A 


909


2799


VEEAWTVWLHWGVREeLLEEETNQKEEAASSN 

WtKAfeGPFWQEDWVWDMRLKMTTRNF 

PCDVEVERFTREVPCLSSLGDGWDCENQEGHLR 

QSALTLEKPGTQEAICEYPGFGEHLIASSDLPPSQ 

RVLATNGFHAPDSNVSGLDCDPALPSYPKSYAD 

KRTGDSDACGKGFNHSMEVIHGRNPVREKPYKY 

PESVKSFNHFTSLGHQBOMKRGKKSYEGKNFENI 

FTLSSSLNENQRNLPGEKQYRCTECGKCFKRNSS 

LVLHIIRTHTGEKPYTCNECGKSFSKNYNLIVHQ 

RIHTGEKPYECSKCGKAFSDGSALTQHQRIHTGE 

KPYECLECGKTiFNRNSSLILHQRTHTGEKPYRCN 

ECGKPFTDISHLTVHLRIHTGEKPYECSKCGKAF 

RDGSYLTQHERTHTGEKPFECAECGKSFNRNSHL 

IVHQKIHSGEKPYECKECGKTFIESAYLIRHQRIH 

TGEKPYGCNQCQKLFRNIAGLIRHQRTHTGEKPY 

ECNQCGKAFRDSSCLTKHQRIHTKETPYQCPECG 

KSFKQNSHLAVHQRLHSREGPSRCPQCGKMFQK 

SSSLVRHQRAHLGEQPMET*WLGAT*VFQFTLTP 

VFRRRVLDLTPLWSVEKNPLSYPVN 


3003 


A


2 


1489 


SLTEHLSFFQPTAHSLTSLLGTMTTCSRQFTSSSS 

MKGSCGIGGGIGGGSSRISSVLAGGSCRAPSTYG 

GGLSVSSRFSSGGACGLGGGYGGGFSSSSSFGSG 

FGGGYGGGLGAGFGGGLGAGFGGGFAGGDGLL 

VGSEKVTMQNLNDRLASYLDKVRALEEANADL 

EVKIRDWYQRQRPSEIKDYSPYFKTIEDLRNKIIA 

ATIENAQPILQIDNARLAADDFRTKYEHELALRQ 

TVEADVNGLRRVLDELTLARTDLEMQIEGLBCEE 

LAYLRKNH*EEMLALRGQTGGEVNVETDAAPG 

VDLSCILNEMRNQYEQMAEKNRRDAETWFLSKT

JSELNKEVASNSELVQSSRSKVIELRRVLQGLEIEL 
QSQLSMKASLENSLEETKQRYCMQLSQIQGLIGS 
VEEQLAQLRCEMEQQSQEYQILLDVKTRLEQEIA 
TYRRLLEGEDAHLSSQQASGQSYSSREVFTSSSSS 
SSRQTRPILKEQSSSSFSQGQSS 


3004 




2 


940 


GCAPDTRFFVPEPGGRGAAPWVALVARQGCTFK 

DKVLVAARRNASAWLYNEERYGNITLPMSHAG 

TGNTWIMISYPKGRElLELVQKGIPVlMnGVGT 

RHVQEHSGQSVVFVAIAFITMMnSLAWLIFYYlQ 

RFLYTGSQIGSQSHRKETKKVIGQLLLHTVKHGE 

KGroVDAENCAVCIENFKVKDIIRILPCKHIFHRIC 

IDPWLLDHRTCPMCKLDVIKALGYWGEPGDVQE 

MPAPESPPGRDPAANLSLALPDDDGSDESSPPSA 

SPAESEPQCDPSFKGDAGENTALLEAGRSDSRHG 
GPIS 


3005 


A 


184 


2552 


TMTfflQFLLLn.FWCa.PHFCSPEIMFRRTPVPQQ 
RILSSRVPRSDGKILHRQKRGWMWNQFFLLEEY 
TGSDYQYVGKLHSDQDKGDGSLKYILSGDGAGT 
LFDDEKTGDIHATRRIDREEKAFYTLRAQAINRR 
TLRPVEPESEFVIKIHDINDNEPTFPEEIYTASVPE 
MSWGTSWQVTATDADDPSYGNSARVIYSILQ 
GQPYFSVEPETGIIRTALPNMNRENREQYQVVIQ 
AKDMGGQMGGLSGTTnOSflTLTDVNDNPPRFPO 
NTIHLRVLESSPVGTAIGSVKAIDADTGKNAEVE 
YRIIDGDGTDMFDIVTEKDTQEGIITVKKPLDYES 
RRLYTLKVEAENTHVDPRFYYLGPFKDTTIVKISI 
vEPVDEPPVFSRSSYLFEVHEDIEVGtnGTVMARD 
PDSISSPIRFSLfaRHTDLDRIFNIHSGNGSLYTSKP 
LDRELSQWHNLTVIAAEINNPKETTRVAVFVRIL 
DANDNAPQFAVFYDTFVCENARPGQLIQTISAVD 
KDDPLGGQKEFFSLAAVNPNFTVQDNEDNTARJDL 
TRXNGFNRHEISTYLLPWISDNDYPIQSSTGTLTI 
RVCACDSQGNMQSCSAEALLLPAGLSTGALIAIL 
LCmLLVIVVLFAALKRQRKKEPLmSKEDIRDNIV 
SYNDEGGGEEDTQAFDIGTLRNPAAffiEKKLRRD 
nPETLFIPRRTPTAPDNTDVRDFINERLKEHDLDP 
TAPPYDSLATYAYEGNDSIAESLSSLESGTTEGD 
QNYDYLREWGPRFNKLPQKYGGGESDKDS 


3006


A




541 


QRVDKTWWGKSVGEVILTELEKALNSIIDVYHKY 
SLKGNFHAVYRDDLKKLLETECPQYIRKKGAD 
VWFKELDINTDGAVNFQEFLILVIKMGVAALNSII 
DVYHKYSLIKGNFHAVYRDDLQKLLETECPQYI 

RKKGADVWFKELDINTDGAVNFQEFLILVIKMG 
VGSPQKKVASYF 


3007 


A 


1 


1253 



MYEGIRCLLKALLGFVSLAIGTLYCPRQYRPFPG 

SLGffiAINVPEPIPDSYYRDMATWPTHAPSVEEG 

GQGRFGNQADHFLGSLAFAKLLNRSLAVPSWIB 

YQHHKPPFTNLHVSYQKYFKLEPLQAYHRVISLE 

DFMEKLAPTHWPPEKRVAYCFEVAAQRSPDKKT 

L/riviTiJivji^rr vjrr yvUKlrri V or NKSELrTGISFSA S 

YREQWSQRFSPKEHPVLALPGAPAQFPVLEEHRP 

LQKYMVWSDEMVKTGEAQIHAHLVRPYVGIHL 

RIGSDWKNACAMLBCDGTAGSHFMASPQCVGYS 

[ISTAAPLTMTMCXPDLKEIQRAVKLWVRSLDAQ 

5VYVATDSESYVPEL00LFKGKVKWSLKPEVA

QVDLmGQADHFIGNCVSSFTAFVKRERDLQGR 
PSSFFGMDRPPKLRDEF 


3008 


A 


3136 


1898 


TARGGGSEPGPTMAAWSSTSTIIREPIVKVKTSS 

QPGFLERLSETSGGMFVGLMAFLLSFYLIFTNEG 

RALKTATSLAEGLSLWSPDSIHSVAPENEGRLV 

HnGAUlTSKlI^DPNYGVHLPAVKl.RJRHV^^ 

QWVETEESREYIEDGQVKKETRYSYNTEWRSEII 

NSKNFDmGHKNPRAMAGESFMATAPFVQIGRF 

FLSSGLIDKVDNFKSLSLSKLEDPHVDIIRRGDFF 

YHSENPKYPEVGDLRVSFSYAGLSGDDPDLGPA 

HWTYIARQRGDQLVPFSIKSGDTLLLLHHGDFS 

AEEVFHRELRSNSMKTWGLRAAGWMAMFMGL 

NLMTRILYTLVDWFPVFRDLVNIGLKAFAFCVAT 

SLTLLTVAAGWLFYRPLWALLIAGLALVPILVAR 

TRVPAKKLE 


3009 


A 


93 


659 


DAAVAMTAQGGLVANRGRRFKWAIELSGPGGG 

SRGRSDRGSGQGDSLYPVGYLDKQVPDTSVQET 

DRILVEKRCWDL\LGPLKQIP1S^FIMYMAGNTI 

SIFPTMlvrVCMMAWRPIQALMAISATFKMLESSS 

QKFLQGLVYLIGNLMGLALAVYKCQSMGLLPTH 

ASDWLAFIEPPERMEFSGGGLLL 


3010 


A 


2 


1041 


LIDSAKARYWTQRGTWVYDNALLLLLKCLWSN 
VVPECTMASSNTVLMRLVASAYSIAQKAGMIVR 
RVIAEGDLGIVEKTCATDLQTKADRLAQMSICSS 
LARKFPKLTIIGEEDLPSEEVDQELIEDSQWEEILK 
QPCPSQYSAIKEEDLWWVDPLDGTKEYTEGLL 
^ DN VTVLIGIAYEGKAkG^^ 
LGRTnVGVLGLGAFGFQLKEVPAGKHnTTTRSH 
SNKLVTDCVAAMNPDAVLRVGGAGNKHQLIEG 
KASAYVFASPGCKKWDTCAPEVILHAVGGKLTD 
mCNVLQYHKDVKHMNSAGVLATLKNinDYYAS 
RVPESIKNALVP 


3011 


A 


291 


1452 


SPQKTMRSHmMTTTSVSSWPYSSHRMRFITNH 

SDQPPQNFSATP^fVTTCPMDEKLLSTVLTTSYSVI 

FIVGLVGMIALYVFLGIHRKRNSIQIYLLNVAIAD 

LLLIFCLPFRIMYHINQNKWTLGVILCKVVGTLFY 

MNMYISIILLGFISLDRYIKINRSIQQRKAITTKQSI 

YVCCIVWMLALGGFLTMIILTLKKGGHNSTMCF 

HYRDKImAKGEAIF^^^LVVMFWLIFLLIILSYIKI 

GKNLLRISKRRSKFPNSGKYATTARNSFIVmFTI 

CFWYHAFRFIYISSQLNVSSCYWKEIVHKTNEIM 

LVLSSFNSCLDPVMYFLMSSl^lRKIMCQLLFRRF 

QGEPSRSESTSEFKPGYSLHDTSVAVKIQSSSKST 


3012 


A 


246 


1346 


TEPVGYTKAEEPIAMRSLGALLLLLSACLAVSAG 

PVPTPPDNIQVQENFNISRIYGKWYNLAIGSTCPW 

LKKIMDRMTVSTLVLGEGATEAEISMTSTRWRK 

GVCEETSGAYEKTDTDGKFLYHKSKWNITMESY 

WHTNYDEYAIFLTKKFSRHHGPTITAKLYGRAP 

QLRETLLQDFRVVAQGVGIPEDSIFTMADRGECV 

PGEQEPEPELlPRVRRAVLPQEEEGSGGGQLVTEV 

TXKEDSCQLGYSAGPCMGMTSRYFYNGTSMAC 

ETFQYGGCMGNGNNFVTEKECLQTCRTVAACN 

LPIVRGPCRAFIQLWAFDAVKGKCVLFPYGGCQ 

GNGNKFYSEKECREYCGVPGDGDEELLRFSN 


3013 


A 


67 


379 


RQMALLKANKDLISAGLKEFSVLLNQQVFNDPL

VSEEDMVTVVEDWMNFYINYYRQQVTGEPOER 

DKALQELRQELNTLANPFLAKYRDFLKSHELPSH 
PPPSS 


3014 


A 


1 


373 


GTSWSTLRAVMSASVVSVVSKVLEEYLSSTPQRL 
KLLDAYLLYILLTGALQFGYCLFVLTFHFNSLLLF 
FFFCVGSFHSNVYFLLFTLSFLCFLFIAYFFLIRFFS 
LFIWFFHVFFIELSLFYF 


3015 


A 


2 


1321 


AAAEGTAPSPGRVSPPTPARGEPEVTVEIGETYLC 

RRPDSTWHSAEVIQSRVNDQEGREEFYVHYVGF 

NRRLDEWVDKNRLALTKTVKDAVQKNSEKYLS 

ELAEQPERKITRNQKRKHDEINHVQKTYAEMDP 

TTAALEKEHEAITKVKYVDKIHIGNYEroAWYFS 

PFPEDYGKQPKLWLCEYCLKYMKYEKSYRFHLG 

QCQWRQPPGKEIYRKSNISVYEVDGKDHKIYCQ 

in.CLLAKLFLDHKTLYFDVEPFVFYILTEVDRQG 

AHTVGYFSKEKESPDGNNVACILTLPPYQRRGYG 

KFLIAFSYELSKLESTVGSPEJCPLSDLGKLSYRSY 

WSWVLLELLRDFRGTLSIKDLSQMTSITQNDHST 

LQSLNMVKYWKGQHVICVTPKLVEEHLKSAQY 

KKPPITGGWGAAVCRGRWGSVSIWTCRSQGLLI 


3016 


A 


2 


1321 


AAAEGTAPSPGRVSPPTPARGEPEVTVEIGETYLC 

RRPDSTWHSAEVIQSRVNDQEGREEFYVHYVGF 

NRRLDEWVDKNRLALTKTVKDAVQKNSEKYLS 

ELAEQPERKITRNQKRKHDEINHVQKTYAEMDP 

TTAALEKEHEAITKVKYVDKmiGNYEIDAWYFS 

PFPEDYGKQPKLWLCEYCLKYMKYEKSYRFHLG

QCQWRQPPGKEIYRKSNISVYEVDGKDHKIYCQ 

NLCLLAKLFLDHKTLYFDVEPFVFYILTEVDRQG 

AHIVGYFSKEKESPDGNNVACIL1LPPYQRRGYG 

KFLIAFSYELSKLESTVGSPEKPLSDLGKLSYRSY 

WSWVLLEILRDFRGTLSlKDLSQMTSrrQNDnST 

LQSLNMVKYWKGQHVTCVTPKLVEEHLKSAQY 

KKPPITGGWGAAVCRGRWGSVSIWTGRSQGLLI 


3017 


A 


38 


704 


EAHPGGQLGSERNGVRMDEDVLTTLKILnGESG 

VGKSSLLLRFTDDTFDPELAATIGVDFKVKTISVD 

GNKAKLAIWDTAGQERFRTLTPSYYRGAQGVIL 

VYDVTRRDTFVKLDNWLNELETYCTRNDIVNM 

LVGNKIDKENREVDRNEGLKFARKHSMLFIEAS 

AKTCDGVQCAFEELVEKnQlKJLWESENQNKG 

VKLSHREEGQGGGACGGYCSVL 


3018 


A 


2640 


2861 


APVLILQMVKLSIVLIPQFLSHDQGQLTOELQQH 

VKSVTCPCEYLRKVSECRQMGPGALEQFPGLSC 
HTSHSG 


3019 


A 


1307 


71 1 



PGITMAASLVUKKIVFVTGl^AKKLEEVVQILGDK 

FPCTLVAQKIDLPEYQGEPDEISIQKCQEAVRQV 

QGPVLVEDTCLCFNALGGLPGPYIKWFLEKLKPE 

GLHQLLAGFEDKSAYALCTFALSTGDPSQPVRLF 

RGRTSGRIVAPRGCODFGWDPCFOPDGYEOTYA 

BMPKAEKNAVSHRFRALLELQEYFGSLAA 


3020 




1202 


180 



VSCLPTSCKMITLNNQDQPVPFNSSHPDEYKIAA
LVFYSCIFnGLFmTALWWSCTTKKRTTVTIYM 
VINVALVDLIFIMTLPFRMFYYAKDEWPFGEYFC 
3ILGALTVFYPSIALWLLAFISADRYMAIVQPKY

AKELKNTCKAVLACVGVWIMTLTriTPLLLLYK 

DPDKDSTPATCLKJSDHYLKAVNVLNLTRLTFPF 

LIPLFIMIGCYLVIIHNLLHGRTSKLKPKVKEKSIRl 

niLLVQVLVCFMPFHICFAFLMLGTGENSYNPW 

GAFTTFLMNl^TCLDmVYIVSKQFQARVISVM 

LYK^m.RSMRRKSFRSGSLRSLSNINSEML 


3021 


A 


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKLQT 
KMCKPRRYWEEETVPTTAGASPGPPRNKKNREL 
RPQRPKNAmKKSRISKKPQVPKKPREWKNPES 
QRGLSGAQDPFPGPAPVPVEVVQKFCRIDKSRKL 
PHSKAKTRSRLEVAEAEEEETSIKAARSELLLAEE 
PGFLEGEDGEDTAKICQADIVEAVDIASAAKHFD 
LlNfLRQFGPYRLNYSRTGRHLAFGGRRGHVAALD 
WVTKKLMCEINVMEAVRDIRFLHSEALLAVAQN 
RWLHIYDNQGIELHCIRRCDRVTRLEFLPFHFLLA
TASETGFLTYLDVSVGiaVAALNARAGRLDVMS
QNPYNAVIHLGHSNGTVSLWSPAMKEPLAKILC 
HRGGVRAVAVDSTGTYMATSGLDHQLKIFDLRG 
TYQPLSTRTLPHGAGHLAFSQRGLLVAGMGDW 
NIWAGQGKASPPSLEQPYLTHRLSGPVHGLQFCP 
FEDVLGVGHTGGITSMLVPGAGEPNFDGLESNPY 
RSRKQRQEWEVKALLEKVPAELICLDPRALAEV 
DVISLEQGKKEQIERLGYDPQAKAPFQPKPKQKG 
RSSTASLVKRKRKVMDEEHRDKVRQSLQQQHH 
KEAKAKPTGARPSALDRFVR 


3022 


A 


1 


2249 


MTAQPSNTSAHAQRDGPELPASSSWRSFWPLSC 

LSSPPySAVEVATEGRDREVAKVGQR^ 

LRQAIU3RDCCVRMPAPVGRRSPPSPRSSMAAVA 

LRDSAQGMTFEDVAIYFSQEEWELLDESQRFLYC 

DVMLENFAHVTSLGYCHGMENEAIASEQSVSIQ 

VRTSKGNTPTQKTHLSEIKMCVPVLKDILPAAEH 

QTTSPVQKSYLGSTSMRGFCFSADLHQHQKHYN 

EEEPWKRKVDEATFVTGCRFHVL>JYFTCGEAFP 

APTDLLQHEATPSGEEPHSSSSKfflQAFFNAKSYY 

KWGEYRKASSHKHTLVQHQSVCSEGGLYECSK 

CEKAFTCKNTLVQHQQIHTGQKMFECSECEESFS 

KKCHLILHKJIHTGERPYECSDREKAFIHKSEFIHH 

QRRHTGGVRHECGECRKTFSYKSNLffiHQRVHT 

GERPYECGECGKSFRQSSSLFRHQRVHSGERPYQ 

CCECGKSFRQIFNLIRHRRVHTGEMPYQCSDCGK 

SFSCKSELIQHQRIHSGERPYECRECGKSFRQFSN 

LIRHRSIHTGDRPYECSECEKSFSRKFILIQHQRVH 

TGERPYECSECGKSFTRKSDLIQHRRIHTGTRPYE 

GSECGKSFRQRSGLIQHRRLHTGERPYECSECGK 

SFSQSASLIQHQRVHTGERPYQCCECGKSFRQIFN 

LIRHRRVHTGEMPYQCSDCGKSFSCKSELIQHRRI 

HSGERPYECSECGKSFSRKShnLIRHRRVHTEERP 


3023 


A 


3148 


634 


AAGALRCLAAFPRAEPASRGRQSSPARACAASR 

AERATAAAMAHRCLRLWGRGGCWPRGLQQLL 

VPGGVGPGEQPCLRTLYRFVTTQARASRNSLLTD 

nAAYQRFCSRPPKGFGKYFPNGKNGKKASEPKE 

WGEKKESKPAATTRSSGGGGGGGGKRGGKKD 

DSHWWSRFQKGDIPWDDKDFRMFFLWTALFWG 

GVMFYLLLKRSGREITWKDFVNNYLSKGVVDRL 

EVVNKRFVRVTFTPGKTPVDGQYVWFNIGSVDT


3024


A


274


1455



FERNLbALQQELGIEGENRVPVVYlAESDGSFLLS 

MLPTVLIIAFLLYTIRRGPAAIGRTGRGMGGLFSV 

GETTAKVLKDEIDVKFKDVAGCEEAKLEIMEFV 

NFLKNPKQYQDLGAKIPKGAILTGPPGTGKTLLA 

KATAGEANVPFITVSGSEFLEMFVGVGPARVRDL 

FALARKNAPCILFIDEIDAVGRKRGRGNFGGQSE 

QENTLNQIXVEMDGFNTTTNVmAGTNRPDim 

PALLRPGRFDRQIHGPPDIKGRASIFKVHLRPLKL 

DSTLEKDKLARKLASLTPGFSGADVANVCNEAA 

LIAARHLSDSINQKHFEQAIERVIGGLEKKTQVLQ 

PEEKKTVAYHEAGHAVAGWYLEHADPLLKVSn 

PRGKGLGYAQYLPKEQYLYTKEQLLDRMCMTL 

GGRVSEEIFFGRITTGAQDDLRKVTQSAYAQIVQ 

FGMNEKVGQISFDLPRQGDMVLEKPYSEATARLI 

DDEVRILINDAYKRTVALLTEKKADVEKVALLL 

LEKEVLDKNDMVELLGPRPFAEKSTYEEFVEGT 

GSLDEDTSLPEGLKDWNKEREKEKEEPPGEKVA 
N 



LRACJS1.FSMSALEKSMHLGRLPSRPPLPGSGGSQ 

SGAKMRMGPGRKRDFSPVPWSQYFESMEDVEV 

ENETGKDTFRVYKSGSEGPVLLLLHGGGHSALS 

WAVFTAAHSRVQCRIVALDLRSHGETKVKNPED 

LSAETMAKDVGNVVEAMYGDLPPPIMLIGHSMG 

GAIAVHTASSNLVPSLLGLCMIDWEGTAMDAL 

NSMQNFLRGRPKTFKSLENAffiWSVKSGQIKNLE 

SARVSMVGQVKQCEGrrSPEGSKSIVEGIIEEEEE 

deegsesiskm:keddmetkkdhpytwrielakt 

EKYWDGWFRGLSNLFLSCPIPKLLLLAGVDRLD 
KDLTIGQMQGKFQMQVLPQCGHAVHEDAPDKV 
AEAVATFLIRHRFAEPIGGFQCVFPGC 



621 



306 



YHGGQRGRAGGSFRSVQGWGGQLRNPFRTSKSL 

SWKGLSSLLFPLYNLQMGRPRDRKELGRGHSPP 

HLEGPHMLPSGAARWRWLEAPVLVLEPLVLRPA 
AAPTP 



1533 



454 



3027 



A


179



703 



AKVPQSTREEKRENGLEAKSPAINLMGFNVEEM 

YEAHAMQRILSLQNHHIIENNHILYLGRKEHDIL 

SQLQKTSSVSITEnSPGRTELEIEGARADLIEWM 

NIEDMLCKVQEEMARKKERGLWRSLGQWTIQQ 

QKTQDEMKENIIFLKCPVPPTQELLDQKKQFEKC 

GLQVLK\^KIDNEVLMAAFQRKKK]yOVlEEK^ 

QPVSHRLFQQVPYQFCNVVCRVGFQRMYSTPCD 

PKYGAGIYFTKNLKNLAEKAKKISAADKLIYVFE 

AEVLTGFFCQGHPLNIVPPPLSPGAIDGHDSVVD 

NVSSPETFVIFSGMQAIPQYLWTCTQEYVQSQDY 

SSGPMRPFAQHPWRGFASGSPVD 



PFHLGAStiNrfRLQVQTQESKAQKEVKMGFIFSK 

SMNESMKNQKEFMLMNARLQLERQLIMQSEMR 

ERQMAMQIAWSREFLKYFGTFFGLAAISLTAGAI 

KKKKPAFLVPIVPLSFILTYQYDLGYGTLLERMK 

GEAEDILETEKSKLQLPRGMITFESIEKARKEOSR 
FFIDK 



3028 A TstT 



1226 



AVGKiiJt'li^{:syiWVRDREGmRSRRSMKMLWKLT 

DNIKYEDCEVSATPARSSVRSQAPSLTLPLLLLSL 

QPAAKRGWDKLSPAQRPSLGFARRTRGRSCRER 
TWMLPSLVSEFLHRD 



233 



WO 01/57190



PCT/US01/04098



3029 


A 


3 


1731 


FREGRFGSSCAVAAPLAGFQGLIECGYLAVDSPP 

SCWTPGGSNPAAPLPQALLPPRLPPTVLPFLGPGL 

SGELEMFTLPQKDFRAPTTCLGPTCMQDLGSSHG 

EDLEGECSRKLDQKLPELRGVGDPAMISSNTSYL 

SSRGRMIKWFWDSAEEGYRTYHMDEYDEDKNP 

SGUNLGTSENKLCFDLLSWRLSQRDMQRVEPSL 

LQYADWRGHLFLREEVAKFLSFYCKSPVPLRPE 

NVVVLNGGASLFSALATVLCEAGEAFLIPTPYYG 

AITQHVCLYGNIRLAYVYLDSEVTGLDTRPFQLT 

VEKLEMALREAHSEGVKVKGLILISPQNPLGDVY 

SPEELQEYLVFAKRHRLHVIVDEVYMLSVFEKSV 

GYRSVLSLERLPDPQRTHVMWATSKDFGMSGLR 

FGTLYTENQDVATAVASLCRYHGLSGLVQYQM 

AQLLRDRDWINQVYLPENHARLKAAHTYVSEEL 

RALGIPFLSRGAGFFIWVDLRKYLLKGTFEEEML 

LWRRFLDNKYLLSFGKAFECKEPGWFRFVFSDQ 

VHRLCLGMQRVQQVLAGKSQVAEDPRPSQSQEP 

SDQRR 


3030 


A 


1 


584 


PWLPWSDGRAARSSRKCPRSRFPVQVGKMAVST 

VFSTSSLMLALSRHSLLSPLLSVTSFRRFYRGDSP 

TDSQKDMIEIPLPPWQERTDESIETKR\RLLYESR 

KRGMLENCILLSLFAKEHLQHMTEKQLNLYDRLI 

NEPSNDWDIYYWATEAKPAPEIFENEVl^LLRD 

FAKNKNKEQRLRAPDLEYLFEKPR 


3031 


A 


1177 


359 


SLWPWILMDDSLMQISLQLLCVYTANFPNGCSSL 
^CWSSCGQHPVQATHRGAVSNSLMLCn.KLASQM 
^PLENITVQQMWMLLSNLALSHDCKGVIQKSNF 
LQNiFLSLALPKGGNKHLSl^TIL>^a.KLLL^ 
DGQQMILRLDGCLDLLTEMSKYKHKSSPLLPLLI 
FHNVCFSPANKPKILANEKVITVLAACLESENQN 
AQRIGAAALWALIYNYQKAKTALKSPSVKRRVD 
EAYSLAKKTFPNSEANPLNAYYLKCLENLVQLL 
NSS 


3032 


A 


2 


1242 


GISGRPPRPAKRRMGKNPVRPPRALPPVPSQDDIP 

LSRPKKKKPRTKNTPASASLEGLAQTAGRRPSEG 

NEPSTKELKEHPEAPVQRRQKKTRLPLELETSST 

QKKSSSSSLLRNENGIDAEPAEEAVIQKPRRKTK 

KTQPAELQYANELGVEDEDnTDEQTTVEQQSVF 

TAPTGISQPVGKVFVEKSRRFQAADRSELIKTTEN 

IDVSMDVKPSWTTRDVALTVHRAFRMIGLFSHG 

FLAGCAVWNIVVIYVLAGDQLSNLSNLLQQYKT 

LAYPFQSLLYLLLALSTISAFDRIDFAKISVArRNF 

LALDPTALASFLYFTALILSLSQQMTSDRIHLYTP 

SSVNGSLWEAGIEEQILQPWIVVNLVVALLVGLS 

WLFLSYRPGMDLSEELMFSSEVEEYPDKEKEEKA 

SS 


3033 


A 


3 


1436 


TATSGGIWLRRKWRCHWPRPLPQSCVGTEGGLQ 

VRDTSSRIAKGGVDHTKMSLHGASGGHERSRDR 

RRSSDRSRDSSHERTESQLTPCIRNVTSPTRQHHV 

EREKDHSSSRPSSPRPQKASPNGSISSAGNSSRNS 

SQSSSDGSCKTAGEMVFVYENAKEGARMRTSER 

VTLIVDNTRFVVDPSIFTAQPNTMLGRMFGSGRE 

HNFIRPNEKGEYEVAEGIGSTVFRAILDYYKTGn 

RCPDGISIPELREACDYLCISFEYSTIKCRDLSALM 

HELSNDGARRQFEFYLEEMILPLMVASAQSGERE

CHIWLTDDDWDWDEEYPPQMGEEYSQnYSTK 

LYRFFKYIENRDVAKSVLKERGLKKIRLGIEGyP 

TYKEKVKKRPGGRPEVIYNYVQRPFIRMSWEKE 

EGKSRHVDFQCVKSKSITNLAAAAADIPQDQLV 

VMHPTPQVDELDILPIHPPSGNSDLDPDAONPML 


3034 


A 


3 


1972 


SSLAQHRSVAVLGWPAGWAAARARPAMQGGN 

SGVRKREEEGDGAGAVAAPPAIDFPAEGPDPEY 

DESDVPAEIQVLKEPLQQPTFPFAVANQLLLVSL 

LEHLSHVHEPNPLRSRQVFKLLCQTFIKMGLLSSF 

TCSDEFSSLRLHHl^ITHLMRSAKERVRQDPCE 

DISRIQKIRSREVALEAQTSRYLNEFEELAILGKG 

GYGRVYKVRNKLDGQYYAIKKILIKGATKTVCM 

KVLREVKVLAGLQHPNIVGYHTAWIEHVHVIQP 

RADRAAIELPSLEVLSDQEEDREQCGVKNDESSS 

SSIIFAEPTPEKEKRFGESDTENQIWKSVKYTTNL 

VIRESGELESTLELQENGLAGLSASSIVEQQLPLR 

RNSHLEESFTSTEESSEENVNFLGQTEAQYHLML 

HIQMQLCELSLWDWIVERNKRGREYVDESACPY 

VMANVATKIFQELVEGVFYIHNMGIVHRDLKPR 

NIFLHGPDQQVKIGDFGLACTDILQKNTDWTNR 

NGKRTPTHTSRVGTCLYASPEQLEGSEYDAKSD 

MYSLGWLLELFQPFGTEMERAEVLTGLRTGQL 

PESLRKRCPVQAKYIQHLTRRNSSQRPSAIQLLQS 

ELFQNSGNVNLTLQMKIIEQEKEIAELKKQLNLL 

SQDKGVRDDGKDGGVG 


3035


A 


110 


1172 


KLSCPCSHGTRVTAVRGPRLKAGVQWHDLGSLQ 

PPPSGLKQSSHLSLSSSWDFRHAPTHPETYTCPK 

MIEMEQAEAQLAELDLLASMFPGENELIVNDQL 

AVAELKDCIEKKTMEGRSSKVYFTINMNLDVSD 

EKMAMFSLACILPFKYPAVLPEITVRSVLLSRSQQ 

TQLNTDLTAFLQKHCHGDVCILNATEWVREHAS 

GYVSRDTSSSPTTGSTVQSVDLBFTRLWIYSHHIY 

NKCKRKNILEWAKELSLSGFSMPGKPGVVCVEG 

PQSACEEFWARLRKLNWKRILIRHREDIPFDGTN 

DETERQRKFSIFEEKVFSVNGARGNHMDFGQLY 

QFLNTKGCGDVFQMFLWV 


3036 


A 


1 


2288 


FRFAERRAAAAESDVSAKMAGRSMQAARCPTD 

ELSLTNCAWNEKDFQSGQHVIVRTSPNHRYTFT 

LKTHPSWPGSIAFSLPQRKWAGLSIGQEIEVSLY 

TFDKAKQCIGTMTIEIDFLQKKSIDSNPYDTDKM 

AAEFIQQFNNQAFSVGQQLVFSFNEKLFGLLVKD 

lEAMDPSILNGEPATGKRQKIEVGLWGNSQVAF 

EKAENSSLNLIGKAKTKENRQSIINPDWNFEKMG 

IGGLDKEFSDIFRRAFASRVFPPEIVEQMGCKHVK 

GILLYGPPGCGKTLLARQIGKMLNAREPKVVNG 

PEILNKYVGESEANIRKLFADAEEEQRRLGANSG 

LHniFDEIDAICKQRGSMAGSTGVHDTWNQLLS 

KIDGVEQLNNILVIGMTNRPDLIDEALLRPGRLEV 

KMEIGLPDEKGRLQILHIHTARMRGHQLLSADV 

UJLKJj^LA VJb 1 KNFSGAELEGLVRAAQSTAMNRHI 

KASTKVEVDMEKAESLQVTRGDFLASLENDIKP 

AFGTNQEDYASYIMNGnKWGDPVTRVLDDGEL 

LVQQTKNSDRTPLVSVLLEGPPHSGKTALAAKIA 

EESNFPFIKICSPDKMIGFSETAKCQAMKKIFDDA 

VnfCSQLSCWXn^DIERLLDYVPIGPRFSNLVLOAL

LVLLKKAPPQGRKLLUGTTSRKDVLQEMEMLNA 
FSTTIHWNIATGEQLLEALELLGNFKDKERTTIA 
QQVKGKKVWIGIKKLLMLIEMSLQMDPEYRVRK 
FLALLREEGASPLDFD 


3037


A 


1 


1347 


MLDTGSEHLNRILKALPALQSAGSEGQNGSAESL 

GEGGTWDSDRAIUaaRGGl^EIPTFYPCLVVRSP 

VTASDLRGTQDFAAYHGLSLILEPLGACNRLSVC 

VPVHSPPGMRVSPRSPSLRTLVIDPAEPAGAQRL 

RFSGKERSGEAGSAVEGLAVAVSMGDGGAERD 

RGPARRAESGGGGGRCGDRSGAGDLRADGGGH 

SPTEVAGTSASSPAGSRESGADSDGQPGPGEADH 

CRRILVRDAKGTIREIVLPKGLDLDRPKRTRTFTT 

AEQLYRLEMEFQRCQYWGRERTELARQLNLSE 

TQVKVWFQNRRTKQKKDQSRDLEKRASSSASEA 

FATSNILRLLEQGRLLSVPRAPSLLALTPSLPGLP 

ASHRGTSLGDPRNSSPRLNPLSSASASPPLPPPLP 

AVCFSSAPLLDLPAGYELGSSAFEPYSWLERKVG 

SASSCKKANT 


3038 


A 


924 


501 


TELLPLCSRSGPKPQSGDPLLQLAQQARPRLSGE 

RLETAPSLLLSRMACVISGWALSRGARTWTWAT 

PTGPVHRAQPAIRSLSAEGALTRLKEEKWPGRYI 

LPNHLTPPFLYKHLGSVPPSHWRSPLISHSVNILA 

LNWR 


3039 


A 


1263 


111 


ACGIRHEGALPGLTATPEAMLRFLPDLAFSFLLIL 

ALGQAVQFQEYVFLQFLGLDKAPSPQKFQPVPYI 

LKKIFQDREAAATTGySRDLCYVKELGVRGNVL 

RfLPDQG#LYPKKISQASSCtQ^^ 

REQLiX AQLGLDLGPNS YYNLGPELELALF 

PHVWGQTTPKPGKMFVLRSVPWPQGAVHFNLL 

DVAKDWNDNPRKNFGLFLEILVKEDRDSGVNFQ 

PEDTCARLRCSLHASLLVVTLNPDQCHPSRKRRA 

AIPWKLSCKNLCHRHQLFINFRDLGWHKWIIAP 

KGFMAOTCHGECPFSLTISLNSS^^yAFMQALMH 

AVDPEIPQAVCIPTKLSPISMLYQDNNDNVILRHY 

EDMWDECGCG 


3040 


A 


15 


849 


ASRLPRGPGCGADMRPLLGLLL VFAGCTFALYL 

LSTRLPRGRRLGSTEEAGGRSLWFPSDLAELREL 

SEVLREYRKEHQAYVFLLFCGAYLYKQGFAIPGS 

SFLNVLAGALFGPWLGLLLCCVLTSVGATCCYL 

LSSIFGKQLVVSYFPDKVALLQRKVEENRNSLFF 

FLLFLRLFPMTPNWFLNLSAPILNIPIVQFFFSVLI 

GLIPYNnCVQTGSILSTLTSLDALFSWDTVFKLL 

AIAMVALIPGTLIKKFSQKHLQLNETSTANHIHSR 

KDT 


3041 


A 


1015 


175 


GLKRRKLCFAKVGDVLGCLSLPPSRSARVLEDISI 

LSCISVDSRIVR'IKVPCSVTMSRPRKRLAGTSGSD 

KGLSGKRTKTENSGEALAKVEDSNPQKTSATKN 

CLKNLSSHWLMKSEPESRLEKG VDVKFSffiDLKA 

QPKQTTCWDGVR>ri^QARNFLRAMKLGEEAFFY 

HSNCKEPGIAGLMKIVKEAYPDHTQFEKNNPHY 

DPSSKEDNPKWSMVDVQFVRMMKRFIPLAELKS 

YHQAHKATGGPLKNMVLFTRQRLSIQPLTQEEF 

DFVLSLEEKEPS 


3042 


A 


1015


175 


GLKRRRLCFAKVGDVLGCLSLPPSRSARVLEDISI 
LSCISVDSRIVRTKVPCSVTMSRPRKRLAGTSGSD

KGLSGKRTKTENSGEALAKVEDSNPQKTSATKN

CLKNLSSHWLMKSEPESRLEKGVDVKFSIEDLKA 

QPKQTTCWDGVIWYQAimFLRAMKLGEEAFFY 

HSNCKEPGIAGLMKIVKEAYPDHTQFEKNNPHY 

DPSSKEDM>KWSMVDVQFVRMMKRPIPLAELKS 

YHQAHKATGGPIJKNMVLFrRQRLSIQPLTOEEF 
DFVLSLEEKEPS 


3043 


A 


153 


1133 


VGTAPAPGGRDRAPAMGSFQLEDFAAGWIGGA 

ASVrVGHPLDTVKTRLQAGVGYGNTLSCIRVVY 

RRESMFGFFKGMSFPLASIAVYNSWFGVFSNTQ 

RFLSQHRCGEPEASPPRTLSDLLLASMVAGWSV 

GLGGPVDLIKIRLQMQTQPFRDANLGLKSRAVAP 

AEQPAYQGPVHCITTIVRNEGLAGLYRGASAML 

LRDVPGYCLYFIPYVFLSEWITPEACTGPSPCAV 

WLAGGMAGAISWGTATPJVIDWKSRLQADGVY 

LNKYKGVLDCISQSYQKEGLKVFERGITVNAVR 

GFPMSAAMFLGYELSLQAIRGDHAVTSP 


3044 


A 


41 


1316 


PPLGAGAGIHARSPHPARRLiaTAAGVGGRASG

LLPTPWRRHHGPSGAAPYPAARLWQGPWRCRR 

PQPMAQRYDELPHYPGIADGPAALAGFPEAVPA 

APGPYGPHRPPQPLPPGLDSDGLKRDKDEIYGHP 

LFPLLALGFEKCELATCSPRDGAGAGLGTPRGGD 

VCSSDSFNEDNTAFAKQVCSERPFSSNPELDNLM 

IQAIQVLRFHLLELEKGKMPIDLVIEDRDGGCRE 

DFEDYPAPCPSLPDQNNIWIRDHEDSGSVHLGTP 

GPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGE 

DEDLDQEPRRNKKRGIFPKVAlT^lRAWLFQHL 

SHPYPSEEQKKQLAQDTGLTDLQVNNWFINARRR 

IVQPMIDQSNRTGQGAAFSPEGQPIGGYTETEPH 

VAFRAPASVGMSLNSEGEWHYL 


3045 


A 


3 


967 


VAHTQWHTCQRLSQLTHRSILKYLLIDTHACQV 

LILKHTHASLSLPSCQECFPSSIPSASHMVSHPHPP 

PSPRWGQTPEGLPAASPCGPGPRSCFSSILPTGDS 

WGMLACLCTVLWHLPAVPALNRTGDPGPGPSIO 

KTYDLTRYLEHQLRSLAGTYLNYLGPPFNEPDFN 

PPRLGAETLPRATVDLEVWRSLNDKLRLTQNYE 

AYSHLLCYLRGLNRQAATAELRRSLAHFCTSLQ 

GLLGSIAGVMAALGYPLPQPLPGTEPTWTPGPAH 

SDFLQKMDDFWLLKELQTWLWRSAKDFNRLKK 
KMQPPAAAVTLHLGAHGF 


3046 


A 


1185 


1584


MYAYMYICTmcICAYRGIHlDVYLYMCIYIHIWI 
HTYLCVHTYVYVYICnnCMCIHTYVYVYTYMY 
VYTYICLCVYICU:VHIYLCVYIHN1YMCTH1CMC 
IHTYVHMCICVYIHMYTCVYVYTYTCVYMY 


3047 


A 


811 


132 



SLDLLGPIGDLQEGRDPGTQGPQEKEKQMPASPM 

NTDAHLDINFKEGLKKERSYTGQFEANVRDEER 

QCGCGWPDSLLMKVLSQRLDQQDCIQKGWVL 

HGVPRDLDQAHLLNRLGYNPNREFFLNVPFDSI 

MERLTLRRIDPVTGERYHLMYKPPPTMEIQARLL 

l^NPKDAEEQVKLKMDLFYRNSADLEQLYGSAIT 

LNGDQDPYTVFEYIESGIINPLPKKIP 


3048




2 


1166


RPRRGQGLVQEVQTENVTVAEGGVAEITCRLHQ 
IfDGSrVVIQNPARQTLFFNGTRALKDERFQLEEFS 
'RRVRIRLSDARLEDEGGYFCQLYTEDTHHQIAT 
.TVLVAPENPWEVREQAVEGGEVELSCLVPRSR

PAATLRWYRDRKELKGVSSSQENGKVWSVAST 
VRFRVDRKDDGGinCEAQNQALPSGHSKQTQYV 

LDVQYSPTARIHASQAWREGDTLVLTCAVTGN

PRPNQIRWNRGNESLPERAEAVGETLTLPGLVSA

DNGTYTCEASNKHGHARALYVLWYGESRLRPT

EGGGGAPDPGAWEAQTSVPYAIVGGILALLVFL

HCVLVGMVWCSVRQKGSYLIHEASGLDEQGEA

REAFLNGSDGHKRKEEFFI


3049 


A 


3159 


882 


VGCTLRVGVMAAAGSRKRRLAELTVDEFLASGF

DSESESESENSPQAETREAREAARSPDKPGGSPSA

SRRKGRASEHKDQLSRLKDRDPEFYKFLQENDQ

SLLNFSDSDSSEEEEGPFHSLPDVLEEASEEEDGA

EEGEDGDRVPRGLKGKKNSVPVTVAMVERWKQ

AAKQRLTPKLFHEVVQAFRAAVATTRGDQESAE

ANKFQVTDSAAFNALVTFCIRDLIGCLQKLLFGK

VAKDSSRMLQPSSSPLWGKLRVDIKAYLGSAIQL

VSCLSETTVLAAVLRFFLSVLVPCFLTFPKQCRML

LKmiVVVWSTGEESLRVLAFLVLSRVOlHKKDT 
FLGPVLKQMYITYVRNCKFTSPGALPFISFMQWT 
LTELLALEPGVAYQHAFLYIRQLAIHLRNAMTTR 
KKETYQSVYNWQYVHCLFLWCRVLSTAGPSEA 
LQPLVYPLAQVnGCIKLIPTARFYPLRMHCIRALT 
LLSGSSGAFIPVLPFILEMFQQVDFNRKPGRMSSK 
PINFSVILKLSNVNLQEKAYRDGLVEQLYDLTLE 
YLHSOAHCIGFPELVLPWLOLKSFLRECKVANY 
.CRQVQQLLGKVQENSAYICSRRQRVSFGVSEQQ . 
rAVEAWEKl,TR^GTPLTLYYSHWRKLRDREIQL 
EISGKERLEDLNFPEIKRRKMADRKDEDRKQFKD 
LFDLNSSEEDDTEGFSERGILRPLSTRHGVEDDEE 
DEEEGEEDSSNSEDGDPDAEAGLAPGELQQLAQ 
GPEDELEDLQLSEDD 


3050 


A 


870 


182 


HLDRYIKSPGSGSSTPAPPSHLLLYLLHPOSTRTM 

GCCGCSRGCGSGCGGCGSSCGGCGSGCGGCGSG 

RGGCGSGCGGCSSSCGGCGSRCYVPVCCCKPVC 

SWVPACSCTSCGSCGGSKGGCGSCGGSKGGCGS 

CGCSQSSCCKPCCCSSGCGSSCSQSSCCKPCCCSS 

GCGSSCCQSSCCKPYCCQSSCCKPCSCFSGCGSS 

CCQSSCYKPCCCQSSCCVPVCCQCKI 


3051 


A 


175 


4330 


NIPRWNFQGKSFGWLVHFSSEEVDMASDSPARS 
LDEIDLSALRDPAGIFELVELVGNGTYGQVYKGR 
HVKTGQLAAIKVMDVTGDEEEEIKQEINMLKKY 
SHHRNUTYYGAFIKKNPPGMDDQLWLVMEFCG 
AGSVTDLIKNTKGYTLKEEWIAYICREILRGLSHL 
HQHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQ 

ldrtvgrrntfigtpywmapeyiacdenpdaty 

dfksdlwslgitaiemaegapplcdmhpmralf 

liprnpaprlkskkwskkfqsfmsclvkkhsqrp 

ateqlmkhpfirixipnerqvriqlkdhmrtkkk 

rgekdeteyeysgseeeeeendsgepssilnlpge 

stlrrdflrlqlankersealrrqqleqqqren 

eehkrqllaerqbcrieeqkeqrrrleeqqrreke 

lrkqqereqrrhyeeqmrreeerrraeheqeyi 

RRQLEEEQRQLEILQQQLLHEQALLLEYKRKQLE 
EQRQAERLQRQLKQERDYLVSLQHQRQEQRPVE 
KKPLYHYKEGMSPSEKPAWAKEVEERSRLNRQS 
PCT/US01/04098






3052 



3053 



A 



3054 



615 



203 



2167 



2212 






SPAJVlFliKVA]^4RISDPNLPPRSHSFSiSGVQPARTP 
PMLRPVDPQIPHLVAVKSQGPALTASQSVHEQPT 
KGLSGFQEALNVTSHRVEMPRQNSDPTSENPPLP 
TRIEKFDRSSWLRQEEDIPPKVPQRTTSISPALAR 
KNSPGNGSALGPRLGSQPIRASNPDLRRTEPILES 
PLQRTSSGSSSSSSTPSSQPSSQGGSQPGSQAGSSE 
RTRVRANSKSEGSPVLPHEPAKVKPEESRDITRPS 
RPASYKKAIDEDLTALAKELRELRIEETNRPMKK 
VTDYSSSSEESESSEEEEEDGESETHDGTVAVSDI 
PRLPTGAPGSNEQYNVGMVGTHGLETSHADSFS 
GSISREGTLMIRETSGEKKRSGHSDSNGFAGHINL 
PDLVQQSHSPAGTPTEGLGRVSTHSQEMDSGTE 
YGMGSSTKASFTPFVDPRVYQTSPTDEDEEDEES 
SAAALFTSELLRQEQAKLNEARKISVVNVNPTNI 
RPHSDTPEIRKYKKRFNSEILCAALWGVNLLVGT 
ENGLMLLDRSGQGKVYNLINRRRFQQMDVLEG 
LNVLVTISGKKNKLRVYYLSWLRNRILHNDPEV 
EKKQGWITVGDLEGCIHYKVVKYERIKFLVIALK 
NAVEIYAWAPKPYHKFMAFKSFADLQHKPLLVD 
LTVEEGQRLKVIFGSHTGFHVIDVDSGNSYDIYIP 
SHIQGNITPHAIVILPKTDGMEMLVCYEDEGVYV 
NTYGRITKDWLQWGEMPTSVAYIHSNQIMGW 
GEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERN 
DKVFFASVRSGGSSQVFFMTLNRNSMMNW 



MGQVECGGQiaGNQLEDDSEPAEGKVYSSDEE 
KLEASAGDPAGSEQEEEGSGGDSEDDGFLDSSA 
: GGPGALLGPKPKLKGSLGTGAEEGAPVTAGVTA 
PGGKSRRRRTAFtSEQLLELEKEFHCKKYLSLTE 
RSQIAHALKLSEVQVKIWFQNRRAKWKRKAGN 
VSSRSGEPVRNPKIWPIPVHVNRFAVRSQHQQM 



FGVRVPSNTQCLVPSFHCMQTSEWDSECLTSLQP 

LPLPTPPAANEAHLQTAAISLWTWAAVQAIERK 

VEIHSRRLLHLEGRTGTAEKKLASCEKTVTELGN 

QLEGKGAVLGTLLQEYGLLQRRLENLENLLRNR 

NFWILRLPPGDCGDIPKVPVAFDDVSIYFSTPEWE 

KLEEWQKELYKNIMKGNYESLISMDYAINQPDV 

LSQIQPEGEHNTEDQAGPEESEIPTDPSEEPGISTS 

DXLSWKQEEEPQVGAPPESKESDVYKSTYADEE 

LVIKAEGLARSSLCPEVPVPFSSPPAAAKDAFSDV 

AFKSQQSTSMTPFGRPATDLPEASEGQVTFTQLG 

SYPLPPPVGEQVFSCHHCGKNLSQDMLLTHQCS 

HATEHPLPCAQCPKHFTPQADLSSTSQDHASEIP 

PTCPHCARTFTHPSRLTYHLRVHNSTERPFPCPDC 

PKRFADQARLTSHRRAHASERPFRCAQCGRSFSL 

BaSLLLHQRGHAQERPFSCPQCGDDFNGHSALIRH 

QMIHTGERPYPCTDCSKSFMRKEHLLNHRRLHT 

GERPFSCPHCGKSFIRKHHLMKHQRIHTGERPYP 

CSYCGRSFRYKQTLKDHLRSGHNGGCGGDSDPS 

GQPPNPPGPLITGLETSGLGVNTEGLETNQWYGE 
GSGGGVL 



SCGHKSAYGSYTGLQLFWEDGQELLQHQQLQD " 
LRLCVHLRPQSEKVELSLWTLFWGKGEPSAVR 
EKLGKAGFAAASGPGGRPGAERASTVLNILHLT 
AESRWEPNACNRVSSSPAGVGPLDLPVGPTJ.VFP 










APWARASFLCHAFQia>LTGIGLNTVRFTSEFPLH 

SKDPTAHKLLFTGNYLCKLHPRPRHAPQGSLSDF 

CHGTEGKDLPSEHNVSVEGVAQDRSPEATLCPQ 

KTCPCDICGLRLKDILHLAEHQTraPRQKPFVCE 

AYVKGSEFSANLPRKQVQQNVHNPIRTEEGQAS 

PVKTCRDHTSDQLSTCREGGKDFVATAGFLQCE 

VTPSDGEPHEATEGVVDFHIALRHNKCCESGDAF 

NNKSTLVQHQRfflSRERPYECSKCGIFFTYAADL 

TQHQKVHNRGKPYECCECGKFFSQHSSLVKHRR 

VHTGESPHVCGDCGKFFSRSSNLIQHKRVHTGEK 

PYECSDCGKFFSQRSl^IHHKRVHTGRSAHECSE 

CGKSFNO^SSLIKHWRVHTGERPYKCNECGKFFS 

HIASLIQHQIVHTGERPHGCGECGKAFIRSSDLMK 

HQRVHTGERPYECNECGKLFSQSSSLNSHRRLHT 

GERPYQCSECGKFFNQSSSLNNHRRLHTGERPYE 

CSECGKTFRQRSNLRQHLKVHKPDRPYECSECG 

KAFNQRPTLIRHQKIHIRERSMENVLLPCSQHTPE 

ISSENRPYQGAVNYKLKLVHPSTHPGEVP 


3055 


A 


268 


2954 


ARRSSSSQGSAAPTPCQWEASRDQLVAGPSGK 

MGNREMEELIPLVNRLQDAFSALGQSCLLELPQI 

AVVGGQSAGKSSVLENFVGRDFLPRGSGIVTRRP 

LVLQLVTSKAEYAEFLHCKGKKFTDFDEVRLEIE 

AETDRVTGMNKGISSIPINLRVYSPHVLNLTLIDL 

PGITKVPVGDQPPDIEYQIRMIMQFITRENCLILA 

VTPANTDLANSDALPCLAKEVDPQGLRTIGVITKL 

DLMDEGTDARDVLENKLLPLRRGYVGVVNRSQ 

KDroGBXDKAAMIAERKFFLSHPAYRHIADRM^ 

GTPHLQKVLNQQLTbemDTLPNFRNKLQGQLLS 

BEHEVEAYKNFKPEDPTRKTKALLQMVQQFAVD 

FEKRIEGSGDQVDTLELSGGAKINRIFHERFPFEIV 

KMEFNEKELRREISYAIKNfflGIRTGLFTPDMAFE 

AIVKKQIVKLKGPSLKSVDLVIQELINTVKKCTK 

KLANFPRLCEEIERIVANHIREREGKTKDQVLLLI 

DIQVSYINTNHEDFIGFANAQQRSSQVHKKTTVG 

NQVBRKGWLTISNIGIMKGGSKGYWFVLTAESLS 

WYKDDEEKEKKYMLPLDNLKVRDVEKSFMSSK 

HIFALFNTEQRNVYKDYKFLELACDSQEDVDSW 

KASLLRAGVYPDKSVGNNKAENDENGQAENFS 

MDPQLERQVETIR>ILVDSYMSIINKCIRDLIPKTI 

MHLMINNVKDFINSELLAQLYSSEDQNTLlVffi 

AEQAQRRDEMLRMYQALKEALGnGDIGTATVS 

TPAPPPVDDSWIQHSRRSPPPSPtTQRRPTLSAPL 

ARPTSGRGPAPAIPSPGPHSGAPPVPFRPGPLPPFP 

SSSDSFGAPPQVPSRPTRAPPSVPSRRPPPSPTRPn 

IRPLESSLLD 


3056 


A 


1674 


1839 


VVRVTCCTPARSTTERTOAYDEEDCVEMVASGG 
WNDVACHTTMYFMCEFDBCKNM 


3057 


A 


1674 


1839 


WRVTCCPPARSTTERTNAYDEEDCVEMVASGG 
W>roVACHTTMYFMCEFDKKNM 


3058 


A 


3363 


2525 


FLVKLILIILCRCLHSLSRSVQQLRTSFQDHAVWK 

PLMKVLQNAPDEILWASSMLCNLLLEFSPSKEPI 

LESGAVELLCGLTQSENPALRVNGIWALMNMAF 

QAEQKIKADILRSLSTEQLFRLLSDSDLNVLMKT 

LGLLKNLLSTOPHIDKIMSTHGKQIMQAVTLILEG 

EHNIE\^QILCILANIADGTTAKDL1MTNDDILQ 










KIKYYMGHSHVKLQLAAMI-CiSNLIWNEEEGSQ " 

ERQDKLRDMGIVDILHKLSQSPDSNLCDKAKMA 
LQQYLA 


3059 


A 

A 


679 
1 if\ 


167 


SSWPSLSSQMHEPSFHLHVAAHYGRDSFVRLLLE 

FKAEVDPLSDKGTTPLQLAIIRERSSCVKILLDHN 

ANIDIQNGFLLRYAVKSNHSYCRMFLQRGADTN 

LGRLEDGQTPLHLSALRDDVLCARMLYNYGAD 

TNTRNYEGQTPLAVSISISGSSRPCLDFLOF.VT,<?M 


3060 




1 30 


234 


FFLQLDMDPWCYCADGDSCICAGSCKCKECKCT 

SCKKSCCSCCPAQCAKCAQGCICKGATDKCSCC 
A 


3061 


A 


428 


720 


VRRDVRQQATWAMASDLDFSPPEVPEPTFLENL 

LRYGLFLGAIFQLICVLAnVPIPKSHEAEAEPSEPR 

SAEVTRKPKAAVPSVNKRPKKETKKKR 


3062


A 


1589 


276 


WKQKYEPLGLDAAGIEEAIIAVGSFELKANELLO 
VIDSSMKNFKAFFRWLYVAMLRMIEDHVLPELN 
KMTQKDrrFVAHTLTEHFNEAPDLYNRKGKYFN 
VERVGQYLKDEDDDLVSPPNTEGNQWYDFLQN 
SSHLKESPLLFPYYPRKSLHFVKRRMEMroOCLO 
KPADVIGKSMNQAICIPLYRDTRSEDSTRRLFKFP 
FLWNNKTSNLHYLLFTILEDSLYKMCE,RRHTDIS 
QSVSNGLIAIKFGSFTYATTEKVRRSrYSCLDAOF 
YDDETVTWLKDTVGREGRDRLLVQLPLSLVYN 
SEDSAEYQFTGTYSTRLDEQCSAIPTRTMHFEKH 
WRLLESMKAQYVAGNGFRKVSCVLSSNLRHVR 
VFEMDEDDEWELDESSDEEEEASNKPVKIKEEVL 
: SESEAENQQAGAAALAPEIVIKVEKLDPFJ.r>5? • - - 


3063 


A . . - 

A 1 


50


849


DKMPSIFAYQSSEVDWCESNFQYSELVAEFYNTF 

SNIPFFFGPLlVmiLIjyiHPYAQKRSRYrifVVWVLF 

MnOLFSMYFHMTLSFLGQLLDEIAILWLLGSGYS 

IWMPRCYFPSFLGGMRSQFIRLVFITTWSTLLSFL 

RPTVNAYALNSIALHILYIVCQEYRKTSNKELRH 

LIEVSWLWAVALTSWISDRLLCSFWQRIHFFYL 

HSIWHVLISITFPYQMVTMALVDANYEMPGETL 

KVRYWPRDSWPVGLPYVEIRGDDKDC 




1 


1523 


925 


AArMADGQMPFSCHYPSRLRRDPFRDSPLSSRLL 

DDGFGMDPFPDDLTASWPDWALPRLSSAWPGTL 

RSGMVPRGPTATARFGVPAEGRTPPPFPGEPWK 

VCVNVHSFKPEELMVKTKDGYVEVSGKHEEKO 

QEGGIVSKNFTKKIQLPAEVDPVTVFASLSPEGll 

IIEAPQVPPYSTFGESSFNNELPODSOEVTCT 


3065 


A 


230 


2929 



LSISLTGSHLFSLGNHSTRENLNAGNFNFPSEGH 

LVRSTGPGGSFAKHMVAQCVSPKGPLACSRTYF 

FGATHVPYLGGDSKLPKKTEQIRLLSQIYAAVIE 

AVLAGIACYAKTSSLTKAKEVAEQTLGSGLDSFE 

LIPFKAALRSKMTFHIHAVNNQGRIVPLDSEDSLS 

FVKTACMAVYDffDLLGGNGCLGSWFSESFLTS 

5ILVKEKDGTVTTETSSVVLTAAVPRFCSWLVED 

"^VKLSEKTHQAVRGDESFLGTYLTGGEGAYLY 

3oi-^L,\lo wrctujiN VJtifJrbaGLLFSHCRHGSmSKD 

iMNSISFYDGDSTSTVAALLIDFKSSLLPHLPVHF 
iGSSNFLMIALFPKSKIYQAFYSEVFSLWKQQDN 
5GISLKVIQEDGLSVEQKRLHSSAQKLFSALSQPA 
jEKRSSLKLLSAKLPELDWFLQHFAISSISQEPVM 
llULPVLLQQAEINrraRIESDKVnsiVTGLPGCH 










ASELCAFLVTLHKECGRWMVYRQIMDSSECFHA 

AHFQRYLSSALEAQQNRSARQSAYIRKKTRLLV 

VLQGYTDVIDWQALQTHPDSNVKASFnGAITA 

CVEPMSCYMEHRFLFPKCLDQCSQGLVSNWFT 

SHTTEQRHPLLVQLQSLIRAANPAAAFILAENGIV 

TRNEDIELILSENSFSSPEMLRSRYLMYPGWYEG 

KLNAGSVWLMVQICVWFGRPLEKTRFVAKCKA 

IQSSDCPSPFSGNIYHILGKVKFSDSERTMEVCYNT 

LANSLSIMPVLEGPTPPPDSKSVSQDSSGQQECYL 

VFIGCSLKEDSDCDWLRQSAKQKPQRKALKTRG 

MLtQQEIRSIHVBaiHLEPLPAGYFi^GTQFVNFF 

GDKTDFHPLMDQFMNDYVEEANREIEKYNQELE 

QQEYHDLFELKP 


3066 


A 


130 


588 


LAPLRCQPGTRTQPRSHPAANDPSAAMSAAGAR 

GLRATYHRLLDKVELMLPEKLRPLYNHPAGPRT 

VFFWAPIMKWGLVCAGLADMARPAEKLSTAQS 

AVLMATGnWSRYSLVUPKNWSLFAVNFFVGAA 

GASQLFRTWRYNQELKAKAHK 


3067 


A 


2 


1016 


EFARRRVFIAAREMSLLRSLRVFLVARTGSYPAG 

SLLRQSPQPRHTFYAGPRLSASASSKELLMKLRR 

KTGYSFVNCKKALETCGGDLKQAEIWLHKEAQ 

KEGWSKAAKLQGRKTKEGLIGLLQEGNTTVLVE 

VNCETDFVSRNLKFQLLVQQVALGTIVIMHCQTL 

KDQPSAYSKGFLNSSELSGLPAGPDREGSLKDQL 

ALAIGKLGENMILKRAAWVKVPSGFYVGSYVHG 

AMQSPSLHKLVLGKYGALVICETSEQKTNLEDV 

'GiaiLGQHWGMAPKVGSLDbEPGGEAETKML 

SQPYLLDPSin-GQYVQPQGVSVVbFVRFECGEG 

EEAAETE 


3068 


A 


3 


1679 


NSRVWGPWTEPSAGSLRPMARKQNRNSKELGL 

VPLTDDTSHAGPPGPGRALLECDHLRSGVPGGR 

RRKDWSCSLLVASLAGAFGSSFLYGYNLSWNA 

PTPYIKAFYNESWERRHGRPIDPDTL'ILLWSVTV 

SIFAIGGLVGTLrVKMGKVLGRKHTLLANNGFAI 

SAALLMACSLQAGAFEMLIVGRPIMGIDGGVALS 

VLPMYLSEISPKEIRGSLGQVTAIFICIGVFTGQLL 

GLPELLGKESTWPYLFGVIWPAWQLLSLPFLP 

DSPRYLLLEKHNEARAVKAFQTFLGKADVSQEV 

EEVLAESRVQRSIRLVSVLELLRAPYVRWQVVT 

VIVTMACYQLCGLNAJWFYTNSBFGKAGIPPAKIP 

YVTLSTGGIETLAAVFSGLVIEHLGRRPLLIGGFG 

LMGLFFGTLTITLTLQDHAPWVPYLSIVGILAIIAS 

FCSGPGGIPFILTGEFFQQSQRPAAFIIAGTVNWLS 

NFAVGLLFPHQKSLDTYCFLVFATICrrGAIYLYF 

VLPETKNRTYAEISQAFSKRNKAYPPEEKIDSAV 

TDGKINGRP 


3069 


A 


861 


300 


AAGAWSAMPKAKGKTRRQKFGYSVNRKRLNR 

NARRKAAPRJECSHIRHAWDHAKSVRQNLAEMG 

lAVDPNRAWLRKRKVKAMEVDIEERPKELVRK 

PYVLNDLEAEASLPEKKGNTLSRDLE)YVRYMV 

ENHGEDYKAMARDEK^^yYQDTPKQIRSKINVY 

KRFYPAEWQDFLDSLQKRKMEVE 


3070 


A 


325 


2019 


LAEPEVATDSGQQADLPAEGGDPRAEASCSVLH 
SKPHAMADSRDPASDQMQHWKEQRAAQKADV 
LTTGAG14PVGDKLNVITVGPRGPLLVQDVVFTD 










EMAHFOREREPERWHAKGAGAFGYFEVTHDIT 

KYSKAKVFEHIGKKTPIAVRFSTVAGESGSADTV 

RDPRGFAVBCFYTEDGNWDLVGNNTPIFFIRDPILF 

PSFIHSQKKNPQTHLKDPDMVWDFWSLRPESLH 

QVSFLFSDRGffDGHRHMNGYGSKmFKLVNANG 

EAVYCKFHYKTDQGIKNLSVEDAARLSQEDPDY 

GIRDLFNAIATGKYPSWTFYIQVMTFNQAETFPF 

NPFDLTKVWPHKDYPLIPVGKLVLNRNPVNYFA 

EVEQIAFDPSNMPPGIEASPDKMLQGRLFAYPDT 

HRHRLGPNYLHIPVNCPYRARVANYQRDGPMC 

MQDNQGGAPNYYPNSFGAPEQQPSALEHSIQYS 

GEVRRFNTANDDNVTQVRAFYVNVLNEEQRKR 

LCENIAGHLKDAQIFIQKKAVKNFTEVHPDYGSH 

IQALLDKYNAEKPKNAIHTFVQSGSHLAAREKA 
NL 


3071 


A 


1 


1187 


SLGWLERPPALSRAAGDGARKJLSGSRRGDVWLT 

SSAAGLLRSVAGGSWCGQQLRARGGSGRCVAR 

AMTGNAGEWCLMESDPGVFTELIKGFGCRGAQ 

VEEIWSLEPENFEKLKPVHGLEFLFKWQPGEEPA 

GSVVQDSRLDTIFFAKQVINNACATQAIVSVLLN 

CTHQDVHLGETLSEFKEFSQSFDAAMKGLALSN 

SDVIRQVHNSFARQQMFEFDTKTSAKEEDAFHF 

VSYVPVNGRLYELDGLREGPIDLGACNQDDWIS 

AVRPVIEKRIQKYSEGEIRFNLMAIVSDRKMIYEQ 

KIAELQRQLAEEEPMDTDQGNSMLSAIQSEVAK 

NQNILIEEEVQKLKRYKiENIRRKHNYLPFIMELL 

KTLAEHQQLiPtVEKAKEKQNAKKAOETK 


3072 
3073 > 


A 

V < 


103 

57 \: 


2775 

J 
1 

j 

i415 I 


RLRTLAPPGLLLGPPLVPDSRRRHQASLTPLHISG 

SPQLVGRGDRKLRTEVLVPPAALPAETRQRRSER 

LPRRTCPRGGAPGPGRSRLPRSLPPPSAIPGLRSPV 

WAAGLGGGGRREPSRGKGGAALRARHRSTMAE 

LGAGGDGHRGGDGAVRSETAPDSYKVQDBCKNA 

SSRPASAISGQNNNHSGNKPDPPPVLRVDDRQRL 

ARERREEREKQLAAREIVWLEREERARQHYEKH 

LEERKKRLEEQRQKEERRRAAVEEKRRQRLEED 

KERHEAWRRTMERSQKPKQKHNRWSWGGSLH 

GSPSmSADPpRRSVSlMNLSKYVDPVISKRLSSS 

SATLLNSPDRARRLQLSPWESSWNRLLTPTHSF 

LARSKSTAALSGEAVIPICPRSASCSPIIMPYKAAH 

SRNSMDRPKLFVTPPEGSSRRRIIHGTASYKKERE 

RENVLFLTSGTRRAVSPSNPKARQPARSRLWLPS 

KSLPHLPGTPRPTSSLPPGSVKAAPAQVRPPSPGN 

mPVKREVKVEPEKKDPEKEPQKVANEPSLKGRA 

PLVKVEEATVEERTPAEPEVGPAAPAMAPAPAS 

APAPASAPAPAPVPTPAMVSAPSSTVNASASVKT 

SAGTTDPEEATRLLAEKRRLAREQREKEERERRE 

QEELERQKREELAQRVAEERTTRREEESRRLEAE 

QAREKEEQLQRQAEERALREWEEAERAQRQKEE 

EARVREEAERVRQEREKHFQREEQERLERKKBL 

BiiiiViRj^. 1 KK 1 JbATDKKTSDQRNGDIAKGALTGG 

rEVSALPCTTNAPGNGKPVGSPHWTSHQSKVT 

^STPDLEKQPNENGVSVQNENFEEIINLPIGSKP 

5RLDVTNSESPEIPLNPILAFDDEGTLGPLP0VDG 

/QTQQTAEVI 

WVCRDHVCLICWDPIAGTGGSRSIMPALPLDO 










LQITHKDPKTGKLRTSPALHPEQKADRYFVLYKP 

PPKDNIPALVEEYLERATFVANDLDWLLALPHD 

KFWCQVIFDETLQKCLDSYLRYVPRKFDEGVAS 

APEVVDMQKRLHRSWLTFLRMSTHKESKDHFIS 

PSAFGEELYNNFLFDIPKILDLCVLFGKGNSPLLQ 

KMIGNIFTQQPSyYSDLDETLPTILQVFSNILQHC 

GLQGIXjANTTPQKLEERGRLTPSDMPLLELKDIV 

LYLCDTCTTLWAFLDIFPLACQTFQKHDFCYRLA 

SFYEAAIPEMESAIKKRRLEDSKLLGDLWQRLSH 

SRKKLMEBFHIILNQICLLPILESSCDNIQGFIEEFL 

QEFSSLLQEKRFLRDYDALFPVAEDISLLQQASSV 

ldetrtayilqavesawegvdrrkatdakdpsv 

ieepngepngvtvtaeavsqasshpenseeeecm 

gaaaavgpamcgveldslisqvkdllpdlgegfi 

lacleyyhydpeqvinnileerlaptlsqldrnl 

dremkpdptplltsrhnvfqndefdvfsrds vdl 

srvhkgkstrkeentrsllndkravaaqrqrye 

qyswveevplqpgeslpyhsvyyedeyddtyd 

gnqvgandadsddelisrrpfttpqvlrtkvpre 

gqeeddddeeddadeeapkpdhfvqdpavlrek 

aearrmaflakkgyrhdsstavagsprghgqs 

rettqerrkkeankatraneinrrtmadrkrsk 

gmips 


3074 


A 


3 


251 


gearspppaaalldmdpetcpcpsggsctcadsc 
kcegckctsckksccsccpaecekcakdcvckg 
, geaaeaeaekcsccq . 


3075 


A


255


982


>:SQESisSQyLyDSAEEGSLAAAAELAAQKREQRL 

vRKFRELHLMRNEARKLNHQ 
WEAKKARLEWELKEEEKKKECAARGEDYEKVK 
LLEISAEDAERWERKKKRKNPDLGFSDYAAAQL 
RQYHRLTKQDCPDMETYERLREKHGEEFFPTSNS 
LLHGTHVPSTEEIDRMVIDLEKQIEKRDKYSREIR 
PYNDDADIDYINERNAKFNKKAERFYGKYTAEI 
KQNLERGTAV 


3076 


A 


255 


982 


sqfslsqvlvdsaeegslaaaaelaaqkreqrl 

RJKJRELHLMRNEARKLNHQEVVEEDKRLKLPAN 

weakkarlewelkeeekkkecaargedyekvk 
lleisaedaerwerkkkrknpdlgfsdyaaaql 

RQYHRLTKQIKPDMETYERLREKHGEEFFPTSNS 
LLHGTHVPSTEEBDRMVIDLEKQIEKRDKYSRRR 
PYNDDADroYINERNAKFNKKAERFYGKYTAEI 
KQNLERGTAV 


3077 


A 


1 


968 


FRLRPRRACAQLLWHPAAGMASWAKGRSYLAP 

GLLQGQVAIVTGGATGIGKAIVKELLELGSNWI 

ASRKLERLKSAADELQANLPPTKQARVIPIQCNIR 

NEEEVNNLVKSTLDTFGKINFLVNNGGGQFLSPA 

EmSSKGWHAVLETNLTGITYMCKAVYSSWMK 

KHGGSIWnVPTKAGFPLAVHSGAARAGVYNLT 

KSLAFEWACSGIRENCVAPGVrYSQTAVENYGSW 

GQSFFEGSFQKIPAKRIGVPEEVSSVVCFLLSPAA 

SFITGQSVDVDGGRSLYTHSYEVPDHDNWPKGA 

GDLSVVKKMKETFKEKAKL 


3078 


A 


2 


3508 


FVRESGKAPVTFDDITVYLLQEEWVLLSQQQKEL 

CGSNKLVAPLGPTVANPELFRKFGRGPEPWLGS 

VQGQRSLLEHHPGKKQMGYMGEMEVQGPTRES 



GQSLPPQKKAYLSHLSTGSGHIEGDWAGRNRKL 

LKPRSIQKSWFVQFPWLIMNEEQTALFCSACREY 

PSIRDKRSRLIEGYTGPFKVE1LKYHAKSKAHMF 

CVNALAARDPIWAARERSIRDPPGDVLASPEPLF 

TADCPIFYPPGPLGGFDSMAELLPSSRAELEDPGG 

DGAIPA]Vrn.DCISDLRQKEITDGIHSSSDINILYN 

DAVESCIQDPSAEGLSEEVPWFEELPWFEDVA 

VYFTREEWGMLDKRQKELYIUDVMRMNYELLAS 

LGPAAAKPDLISKLERRAAPWIKDPNGPKWGKG 

RPPGNKKMVAVREADTQASAADSALLPGSPVEA 

RASCCSSSICEEGDGPRJRIKRTYRPRSIQRSWFGQ 

FPWLVIDPKETKLFCSACIERPNLHDKSSRLVRG 

YTGPFKVETLKYHEVSKAHRLCVNTVEIKEDTPH 

TALVPEISSDLMANMEHFFNAAYSIAYHSRPLND 

FEKILQLLQSTGTVILGKYRNRTACTQFIKYISETL 

KREILEDVRNSPCVSVLLDSSTDASEQACVGIYIR 

YFKQMEVKESYITLAPLYSETADGYFETIVSALD 

ELDIPFRKPGWWGLGTDGSAMLSCRGGLVEKF 

QEVIPQLLPVHCVAHRLHLAWDACGSIDLVKK 

CDRHIRTVFKFYQSSNKRLNELQEGAAPLEQEIIR 

LBCDLNAVRWVASRRRTLHALLVSWPALARHLQ 

RVAEAGGQIGHRAKGMLKLMRGFHFVKFCHFL 

LDFLSIYRPLSEVCQKEIVLITEVNATLGRAYVAL 

ESLRHQAGPKEEEFNASFKDGRLHGICLDKLEVA 

EQRFQADRERTVLTGIEYLQQRFDADRPPQLKN 

MEVFDTMAWPSGIELASFGNDDILNLARYFECSL 

:pxgyseealleewlglktiaqhlpfsmlcknala 

QHCRFPLLSKLMAVWCVPISTSCCERGFKAMN 
RIRTDERTKLSNEVLNMLMMTAVNGVAVTEYD 
PQPAJQHWYLTSSGRRFSHVYTCAQVPARSPASA 
RLRKEEMGALYVEEPRTQKPPILPSREAAEVLKD 
CIMEPPERLLYPHTSQEAPGMS 



1513 



FSl'i.JSi'KJLCSLGGWGALQAGEPCQPSRAGCGRE 
GATMGCTLSAEERAALERSKAIEKNLKEDGISAA 
KDVKLLLLGAGESGKSTIVKQMKIIHEDGFSGED 
VKQYKPVVYSNTIQSLAAIVRAMDTLGIEYGDK 
ERKADAKMVCDVVSRI^DTEPFSAELLSAMMR 
LWGDSGIQECFNRSREYQLNDSAKYYLDSLDRIG 
AADYQPTEQDILRmVKTTGIVETHFTFKNLHFR 
LFDVGGQRSERKKWIHCFEDVTAIffCVALSGYD 
QVLHEDETTNRMHESLKLFDSICNNKWFTDTSII 
LFLNKKDIFEEKIKKSPLTICFPEYTGPSAFTEAVA 
YIQAQYESKNKSAHKEIYSHVTCATDTNNIOFVF 
DAVTDVnAKNLRGCGLY 



997 



1996 



EARTAm.llXjVTDGLTMADQPKPISPLKNLLA 

GGFGGVCLVFVGHPLDTVKVRLQTQPPSLPGQPP 

MYSGTFDCFRKTLFREGITGLYRGMAAPnGVTP 

MFAVCFFGFGLGKKLQQKHPEDVLSYPQLFAAG 

MLSGVFTTGIMTPGERIKCLLQIQASSGESKYTGT 

LDCAKKLYQEFGIRGIYKGTVLTLMRDVPASGM 

YFMTYEWLKNIFTPEGKRVSELSAPRILVAGGIA 

GIFNWAVAIPPDVLKSRFQTAPPGKYPNGFRDVL 

RELIRDEGVTSLYKGFNAVMIRAFPANAACFLGF 
EVAMKFLNWATPNL 



IMADMEDLFGSDADSEAERKDSDSGSDSDSDQE 










NAASGSNASGSESDQDERGDSGQPSNKELFGDD 

SEDEGASHHSGSDNHSERSDNRSEASERSDHEDN 

DPSDVDQHSGSEAPNDDEDEGHRSDGGSHHSEA 

EGSEKAHSDDEKWGREDKSDQSDDEKIQNSDDE 

ERAQGSDEDKLQNSDDDEKMQNTDDEERPQLS 

DDERQQLSEEEKANSDDERPVASDNDDEKQNSD 

DEEQPQLSDEEKMQNSDDERPQASDEEHRHSDD 

EEEQDHKSESARGSDSEDEVLRMKRKNAIASDSE 

ADSDTEVPKDNSGTMDLFGGADDISSGSDGEDK 

PFTPGQPVDENGLPQDQQEEEPIPETRIEVEIPKV 

NTDLGNDLYFVKLPNFLSVEPRPFDPQYYEDEFE 

DEEMLDEEGRTRLKLKVENTIRWRIRRDEEGNEI 

KESNARIVK WSDG SMSLHLGNEVFD VYKAPLOG 

DHNHLFIRQGTGLQGQAVFKTKLTFRPHSTDSAT 

HRKMTLSLADRCSKTQKIRILPMAGRDPECQRTE 

MIKKEEERLRASIRRESQQKRMREKQHQRGLSAS 

YLEPDRYDEEEEGEESISLAAlkNRYKGGIREERA 

RIYSSDSDEGSEEDKAQRLLKAKKLTSDEVRPNL 

FNSRGLSCTQEPTALNEELTDQAGTN 


3082 


A 


3 


921 



VEFCLPASADSSSLVAASLAGVRKMATNFLAHE 

KIWFDKFKYDDAERRFYEQMNGPVAGASRQEN 

GASVELRDIARARENIQKSLAGSSGPGASSGTSGD 

HGELWRIASLEVENOSLRGWOELOOAT^KT FA 

RLNVLEKSSPGHRATAPQTQHVSPMRQVEPPAK 

KPATPAEDDEDDDIDLFGSDNEEEDKEAAQLREE 

RLRQYAEKKAKKPALVAKSSILLDVKPWDDETD.. 

MA'QLEACVRSIQLDGLWGASli.WVGYGi^ ^ 

Qi(y:VVEDDKVGTbLLEEEITKFEEHVQSVDI^ " 

FNKI 


3083 


A 


3 


921 


VEFCLPASADSSSLVAASLAGVRKMATNFLAHE 

KIWFDKFKYDDAERRFYEQMNGPVAGASRQEN 

GASVILRDIARARENIQKSLAGSSGPGASSGTSGD 

HGELVVRIASLEVENOSLRGWOELOOAISKLEA 

RLNVLEKSSPGHRATAPQTQHVSPMRQVEPPAK 

KPATPAEDDEDDDIDLFGSDNEEEDKEAAQLREE 

RLRQYAEKJCAKKPALVAKSSILLDVKPWDDETD 

MAQLEACVRSIQLDGLVWGASKLVPVGYGIRKL 

QIQCWEDDKVGTDLLEEEITKFEEHVQSVDIAA 

FNKI 


3084 


A 


128 


4050 


KSIVKIRKRMAAETQTLNFGPEWLRALSSGGSITS 

PPLSPALPKYKLADYRYGREEMLALFLKDNKIPS 

DLLDKEFLPILQEEPLPPLALVPFTEEEQRNFSMS 

VNSAAVLRLTGRGGGGTWGAPRGRSSSRGRGR 

GRGECGFYQRSFDEVEGVFGRGGGREMHRSQS 

WEERGDRRFEKPGRKDVGRPNFEEGGPTSVGRK 

HEFIRSESENWRIFREEQNGEDEDGGWRLAGSRR 

DGERWRPHSPDGPRSAGWREHMERRRRFEFDFR 

DRDDERGYRRVRSGSGSIDDDRDSLPEWCLEDA 

EEEMGTFDSSGAFLSLKKVQKEPIPEEQEMDFRP 

VDEGEECSDSEGSHNEEAKEPDKTNKKEGEKTD 

RVGVEASEETPQTSSSSARPGTPSDHQSQEASQFE 

RKDEPKTEQTEKAEEETRMENSLPAKVPSRGDE 

MVADVQQPLSQIPSDTASPLLILPPPVPNPSPTLRP 

VETPWGAPGMGSVSTEPPDEEGLKHLEQQAEK 

MVAYLQDSALDDERLASKLQEHRAKGVSIPLMH 



3085 






128 



4050 



EAMQKWYYKDPQGEIQGPFNNQEMAEWFQAG 

YFTMSLLVKRACDESFQPLGDIMKMWGRVPFSP 

GPAPPPHMGELDQERLTRQQELTALYQMQHLQY 

QQFLIQQQYAQVLAQQQKAALSSQQQQQLALLL 

QQFQTLKMRISDQNIIPSVTRSVSVPDTGSIWELQ 

PTASQPWWBGGSVWDLPLDTTTPGPALEQLQQ 

LEKAKAAKLEQERREAEMRAKREEEERKRQEEL 

RRRQKGILRRQQEEERKRREEEELARRKQEEALR 

RQREQEIALRRQREEEERQQQEEALRRLEERRRE 

EEERRKQEELLRKQEEEAAKWAREEEEAQRRLE 

ENRLRMEEEAARLRHEEEERKRKELEVQRQKEL 

MRQRQQQQEALRRLQQQQQQQQLAQMKLPSSS 

TWGQQSNTTACQSQATLSLAEIQKLEEERERQLR 

EEQRRQQRELMKALQQQQQQQQQKLSGWGNV 

SKPSGTTKSLLEIQQEEARQMQKQQQQQQQHQQ 

PNRARIWraSNLHTSIGNSVWGSINTGPPNQWA 

SDLVSSIWSNADTKNSNMGFWDDAVKEVGPRN 

STNKNKNNASLSKSVGVSNRQNKKVEEEEKLLK 

LFQGVNKAQDGFTQWCEQMLHALNTANNLDVP 

TFVSFLKEVESPYEVHDYIRAYLGDTSEAKEFAK 

QFLERRAKQKANQQRQQQQLPQQQQQPPQQPP 

QQPQQQDSVWGMNHSTLHSVFQTNQSNNQQSN 
FEAVQSGKKICKKQKMVRADPSLLGFSVNASSER 
LNMGEIETXDDY , 



KSIVKIRJ?;jlMAAETQTLNFGPEWLRALSSGGSITS 

PPLSPALPKYKLADYRYGREEMLALFLKDNKIPS 

DLLDKEFLPILQEEPLPPLALVPFTEEEQRNFSMS 

VNSAAVLRLTGRGGGGTWGAPRGRSSSRGRGR 

GRGECGFYQRSFDEVEGVFGRGGGREMHRSQS 

WEERGDRRFEKPGRKDVGRPNFEEGGPTSVGRK 

HEFIRSESENWRIFREEQNGEDEDGGWRLAGSRR 

DGERWRPHSPDGPRSAGWREHMERRRRFEFDFR 

DRDDERGYRRVRSGSGSIDDDRDSLPEWCLEDA 

EEEMGTFDSSGAFLSLKKVQKEPIPEEQEMDFRP 

VDEGEECSDSEGSHNEEAKEPDKTNKKEGEKTD 

RVGVEASEETPQTSSSSARPGTPSDHQSQEASQFE 

RKDEPKTEQTEKAEEETRMENSLPAKVPSRGDE 

MVADVQQPLSQJPSDTASPLLILPPPVPNPSPTLRP 

VETPVVGAPGMGSVSTEPDDEEGLKHLEQQAEK 

MVAYLQDSALDDERLASKLQEHRAKGVSIPLMH 

EAMQKWYYKDPQGEIQGPFNNQEMAEWFQAG 

YFTMSLLVKRACDESFQPLGDIMKMWGRVPFSP 

GPAPPPHMGELDQERLTRQQELTALYQMQHLQY 

QQFLIQQQYAQVLAQQQKAALSSQQQQQLALLL 

QQFQTLKMRISDQNnPSVTRSVSVPDTGSIWELQ 

PTASQPTVWBGGSVWDLPLDTTTPGPALEQLQQ 

LEKAKAAKLEQERREAEMRAKREEEERKRQEEL 

RRRQKGILRRQQEEERKRREEEELARRKQEEALR 

RQREQEIALRRQREEEERQQQEEALRRLEERRRE 

EEERRKQEELLRKQEEEAAKWAREEEEAQRRLE 

ENRLRMEEEAARLRHEEEERKRKELEVQRQKEL 

MRQRQQQQEALRRLQQQQQQQQLAQMKLPSSS 

TWGQQSNTTACQSQATLSLAEIQKLEEERERQLR 

EEQRRQQRELIVIKALQQQQQQQQQKLSGWGNV 

SKPSGTTKSLLEIQQEEARQMQKQQQQQQQHQQ 










PNRARNNTHSNLHTSIGNSVWGSINTGPPNQWA 

SDLVSSIWSNADTKNSNMGFWDDAVKEVGPRN 

STNK^^KNNASLSKSVGVSNRONKK■VEEEEK^ 

LFQGVNKAQDGFTQWCEQMLHALNTAhJNLDVP 

TFVSFLKEVESPYEVHDYIRAYLGDTSEAKEFAK 

QFLER]RAKQBO\NQQRQQQQLPQQQQQPPQQPP 

QQPQQQDSVWGMNHSTLHSVFQTNQSNNQQSN 

FEAVQSGKKKKKQKMVRADPSLLGFSVNASSER 

LNMGEIETLDDY 


3086 


A 


675 


1334 


lhpaatstawlhvppglsmalswvltvlsli.pl 

LEAQIPLCANLVPVPITNATLDRITGKWFYIASAF 

RNEEYNKSVQEIQATFFYFTPNKTEDTIFLREYQT 

RQDQCIYNTTYLNVQRENGTISRYVGGQEHFAH 

LLILRDTKTYMLAFDVNDEKNWGLSVYADKPET 

TKEQLGEFV1EALDCLRIPKSDVVYTDWKKDKCE 

PLEKQHEKERKQEEGES 


3087 


A 


1 


1575 


CTPVARSMATTATCTRFTDDYQLFEELGKGAFS 

VVRRCVKKTSTQEYAAKIINTKKLSARDHQKLE 

REARICRLLKHPNIVRLHDSISEEGFHYLVFDLVT 

GGELFEDIVAREYYSEADASHCIHQILESVNHIHQ 

HDIVHRDLKPENLLLASKCKGAAVKLADFGLAIE 

VQGEQQAWFGFAGTPGYLSPEVLRKDPYGKPVD 

IWACGVILYILLVGYPPFWDEDQHKLYQQIKAG 

AYDFPSPEWDTVTPEAKNLINQMLTINPAKRITA 

DQALKHPWVCQRSTVASMMHRQETVECLRKFN 










NATOGDCGSTESCimTEDEDLK 

LIEAINNGDFEAYTKICDPGLTSFEPEALGNLVEG 

MDFHKFYFENLLSKNSKPIHTTILNPHVHVIGED 

AACIAYIRLTQYIDGQGRPRTSQSEETRVWHRRD 

GKWLNVHYHCSGAPAAPLQ 


3088 


A 


12 


1039 


SSVAEFPERVQLSQPQNWNFSGAGGAWSLDFAE 

QLKWSAELARLGESIMDGKQGGMDGSKPAGPR 

DFPGIRLLSNPLMGDAVSDWSPMHEAAIHGHQL 

SLRNLISQGWAVNnTADHVSPLHEACLGGHLSC 

VKILLKHGAOVNGVTADWHTPT FNACVSOSWD 

CVNLLLQHGASVQPESDLASPIHEAARRGHVEC 

VNSLIAYGGMDHKISHLGTPLYLACENQQRACV 

KKLLESGADVNQGKGQDSPLHAVARTASEELAC 

LLMDFGADTOAKNAEGKRPVELVPPESPLAOLF 

LEREGPPSLMQLCRLRIRKCFGIQQHHKITKLVLP 

EDLKQFLLHL 


3089 


A 


73 


432 


DMAGLMnVTSLLFLGVCAHHnPTGSVVLPSPCC 
MFFVSKRIPENRVVSYQLSSRSTCLKAGVIFTTKK 
GQQFCGDPKQEWVQRYMKNLDAKQKKASPRA 
RAVAVKGPVQRYPGNQTTC 


3090 


A 


4627 


611 


LMEAGGGGGALPAGVETMVLTLGESWPVLVGR 

RFLSLSAADGSDGSHDSWDVERVAEWPWLSGTI 

RAVSHTDVTKKDLKVCVEFDGESWRKRRWIEV 

YSLLRRAFLVEHNLVLAERKSPEISERIVQWPAIT 

YKPLLDKAGLGSITSVRFLGDQQRVFLSKDLLKP 

IQDVNSLRLSLTDNQIVSKEFQALIVKHLDESHLL 

KGDBCNLVGSEVKIYSLDPSTQWFSATWNGNPA 

SKTLQVNCEEIPALKIVDPSLIHVEWHDNLVTC 










GNSARIGAVKRKSSENNGTLVSKQAKSCSEASPS " 

MCPVQSVPTTVFKEILLGCTAATPPSKDPRQQST 

PQAANSPPNLGAKIPQGCHKQSLPEEISSCLNTKS 

EALRTKPDVCKAGLLSKSSQIGTGDLKILTEPKGS 

CTQPKTNTDQENRLESVPQALTGLPKECLPTKAS 

SKAELEIAOTPELQKHLEHAPSPSDVSNAPEVKA 

GVNSDSPNNCSGKKVEPSALACRSQNLKESSVK 

VDNESCCSRSNNKIQNAPSRKSVLTDPAKLKKLQ 

QSGEAFVQDDSCVNIVAQLPKCRECRLDSLRKD 

KEQQKDSPVFCRFFHFRRLQFNKHGVLRVEGFLT 

PNKYDNEAIGLWLPLTK>IVVGIDLDTAKYILANI 

GDHFCQMVISEKEAMSTIEPHRQVAWKRAVKG 

VREMCDVCDTTIFNLHWVCPRCGFGVCVDCYR 

MKRKNCQQGAAYKTFSWLKCVKSQIHEPENLM 

PTQIIPGKALYDVGDIVHSVRAKWGIKANCPCSN 

RQFKLFSKPASKEDLKQTSLAGEKPTLGAVLQQ 

NPSVLEPAAVGGEAASKPAGSMKPACPASTSPLN 

WLADLTSGNVNKEISOKEKQPTMPILKI^IKCL^^ 

PPLSKSSTVLHTFNSTELTPVSNNNSGFLRNLLNSS 

TGKTENGLKNTPKILDDIFASLVQNKTTSDLSKR 

PQGLTDCPSILGFDTPHYWLCDNRLLCLQDPNNK 

SNWNVFRECWKQGQPVMVSGVHHKLNSELWK 

PESFRKEFGEQEVDLVNCRTNEnTGATVGDFWD 

GFEDVPNRLKl^KEPNlVLKIJaDWPPGEDFRDM 

MPSRFDDLMANIPLPEYTRRDGKLNLASRLPNYF 

VRPDLGPKMYNAYGLITPEDRKYGTTNLHLDVS 

DAANVMVYVGDPKGQCEQEEEVLKTIQDGDSDE 

LTIKRFffii3KEKPGALWHIYAAKDTEKIREFLKK 

VSEEQGQENPADHDPIHDQSWYLDRSLRKRLHQ 

EYGVQGWAIVQFLGDWFIPAGAPHQVHNLYSC 

IKVAEDFVSPEHVKHCFWLTQEFRYLSQTHTNHE 

DKLQVK>rVIYHA\^AVAMLKASESSFGKP 


3091


A 


97 


1838 


KRGARRGGWKRKMPSTDLLMLKAFEPYLEILEV 

YSTKAKNYVNGHCTKYEPWQLIAWSVVWTLLI 

WGYEFVFQPESLWSRFKKKCFKLTRKMPIIGRK 

IQDKLNKTKDDISKNMSFLKVDKEYVKALPSQG 

LSSSAVLEKLKEYSSMDAFWQEGRASGTVYSGE 

EKLTELLVKAYGDFAWSNPLHPDIFPGLRKffiAEI 

VRIACSLFNGGPDSCGCVTSGGTESILMACKAYR 

DLAFEKGIKTPEIVAPQSAHAAFr^ASYFGMKI 

VRWLTKMMEVDVRAMRRAISRNTAMLVCSTP 

QFPHGVIDPVPEVAKLAVKYKIPLHVDACLGGFL 

IVFMEBCAGYPLEHPFDFRVKGVTSISADTfiKYGY 

APKGSSLVLYSDKKYKNYQFFVDTDWQGGIYAS 

PTLAGSRPGGISAACWAALMHFGENGYVEATKQI 

IKTARFLKSELENIKGIFVFGNPQLSVIALGSRDFD 

lYRLSNLMTAKGWNLNQLQFPPSIHFCITLLHAR 

KRVAIQFLKDIRESVTQIMKNPKAKTTGMGAIYG 

MAQTTVDRNMGAELSSVFLDSLYSTDTVTQGSO 

MJNOox'ivrn 


3092 


A 


79 


2652 



LCSQNSPEDWVNFSSEKQKRYPWYWTGRKLRSE" 

RAMKIQKKLTGCSRLMLLCLSLELLLEAGAGNIH 

YSVPEETDKGSFVGNIAKDLGLQPQELADGGVRI 

VSRGRMPLFALNPRSGSLITARRIDREELCAQSM 

PCLVSFMLVEDKMKLFPVEVEiroiNDNTPQFQL 










EELEFKMNEIITPGTRVSLPFGQDLDVGMNSLQS 

YQLSSNPHFSLDVQQGADGPQHPEMVLQSPLDR 

EEEAVHHLDLTASDGGEPVRSGTLRIYIQWDAN 

DNPPAFTQAQYHim^PEhrsa'LGTQLLMVNATDP 

DEGANGEVTYSFHNVDHRVAQIFRLDSYTGEISN 

KEPLDFEEYKMYSMEVQAQDGAGLMAKVKVLI 

KVLDVNDNAPEVTrTSVTTAVPENFPPGTHALISV 

HDQDSGDNGYTTCFIPGNLPFKLEKLVDNYYRL 

VTERTLDRELISGYNITITAIDQGTPALSTETHISL 

LVTDINDNSPVFHQDSYSAYIPENNPRGASIFSVR 

AHDLDSNENAQITYSLIEDTIQGAPLSAYLSINSD 

TGVLYALRSFDYEQFRDMQLKVMARDSGDPPLS 

SNVSLSLFLLDQNDNAPEILYPALPTDGSTGVEL 

APRSAEPGYLVTKVVAVDRDSGQNAWLSYRLL 

KA SEPGLFS VGLHTGF VRT ARALLDRD ALKOSL 

WAVQDHGQPPLSATVTLTVAVADRIPDILADLG 

SLEPSAKPNDSDLTLYLVVAEAAVSCVFLAFVIV 

LLAHRLRRWHKSRLLQASGGGLASTPGSHFVGV 

DGVRAFLQTYSHEVSLTADSRKSHLIFPQPNYAD 

TLISQESCEKKGFLSAPQSLLEDKKEPFSQVNFCD 

ECISYLEKNNS 


3093 


A 


1 


3868 



ppdnqklglleallkigdwqhaqnimdqmppyy 
aashklialaicklihitieplyrsvtswavdhag 
flesdpcdstvghllsrvgvpkgakgspvnalq 
nkrapkqaesfedlrrdvfnmfcylgphlshdpi 
lfakvvrigksfmkefqsdgskqedkektevils 
. pllsitdqvllpslslmdcnacmseelwgme^t 
Vpyqhryrlygqwknet^ 
rakyimbcrltkenvkpsgrqigklshsnptilfd 

YVCFEILSQIQKYDNLITPVVDSLKYLTSLNYDVL 
ACILSNCIIEALAI^EKERMKHDDTTISSWLQSLA 
SFCGAVFRKYPIDLAGLLQYVANQLKAGKSFDL 
LILKEVVQKMAGIEITEEMTMEQLEAMTGGEQL 
KAEGGYFGQIRNTKKSSQRLKDALLDHDLALPL 
CLLMAQQRNGVIFQEGGEKHLKLVGKLYDQCH 
DTLVQFGGFLASNLSTEDYIKRVPSIDVLCNEFHT 
. PHDAAFFLSRPMYAHfflSSKYDELKKSEKGSKQ 

qhkvhkyitscemvmapvheavvslhvskvwd 

dispqfyatfwsltmydlavphtsyerevnklk 

vqmkaiddnqemppnkkkkekerctalqdkll 

eeekkqmehvqrvlqrlklekdnwllakstkn 

etitkflqlcifprcifsaidavycarfvelvhqq 

ktpnfstllcydrvfsdiiytvascteneasrygr 

flccmletvtrwhsdratyekecgnypgfltil 

ratgfdggnkadqldyenfrhvvhkwhyklt 

kasvhcletgeythtonilivlixilpwypkvlnl 

gqalerrvhkicqeekekrpdlyalamgysgql 

ksrksymipenefhhkdppprnavasvqngpgg 

gpssssigsasksdessteetdksrersqcgvkav 

nkassttpkgnssngnsgsnsnkavkendkekg 

kekekekkektpattpearvlgkdgkekpkeer 

pnkdekaretkertpksdkekekjpbckeekakde 

kfkttvpnaeskstqererekepsrerdiakemk 

skenvkggektpvsgslkspvprsdipepereqkr 

rkidthpspshsstvkdslielkessaklyinhtpp

PLSK5KEREMDKKDLDKSRERSREREKKDEKDR 

KERKRDHSNNDREVPPDLTKRRKEENGTMGVSK 

HKSESPCESPYPNEKDKEKNKSKSSGKEKGSDSF 

KSEKMDKISSGGKKESRHDKEKIEKKEKRDSSGG 

KEEKKHHKSSDKHR 


3094 


A 


2 


891 


AMLGTFUBPSKRGAGAVQAEVSERLAMAGPQQQ 

PPYLHLAELTASQFLEIWKHFDADGNGYIEGKEL 

ENFFQELEKARKGSGMMSKSDNFGEKMKEFMQ 

KYDKNSDGKIEMAELAQILPTEENFLLCFRQHVG 

SSAEFMEAWRKYDTDRSGYIEANELKGFLSDLL 

KKANRPYDEPKLQEYTQTILRMFDLNGDGKLGL 

SEMSRLLPVQENFLLKFQGMKLTSEEFNAIFTFY 

DKDRSGYIDEHELDALLKI)LYEKNKKEMNIQQL 

TmTO:SVMSLAEAGKLYRKDLEIVLCSEPPM 


3095 


A 


1685 


700 


RRPTGRPGALGAPAAGRVGMPLHVKWPFPAVPP 

LTWILASSVVMGLVGTYSCFWTKYMNHLTVHN 

REVLYELIEKRGPATPLITVSNHQSCMDDPHLWG 

ILKLRHIWNliCLMRWTPAAADICFTKELHSHFFS 

LGKCVPVCRGAEFFQAENEGKGVLDTGRHMPG 

AGKRREKGDGVYQKGMDFILEKLNHGDWVHIF 

PEGKVNMSSEFLRFKWGIGRLIAECHLNPIILPLW 

HVGMNDVLPNSPPYFPRFGQKITVLIGKPFSALP 

VLERLRAENKSAVEMRKALTDFIQEEFQHLKTQ 

AEQLHNHLQAWEIGLACCLLDSWPAQSWG 


3096 


A 


6642 


4022 


FVPGLREPQWEPAQPSATMSAPSEHEEYARLVM 
EAQPEWLRAEVKRLSHELAETTREKIQAAEYGL 

avleekhqlklqfeelevdyeairsemeqlkeaf 

gqahtnhkkvAadgesreesliqesasbceqyyv 

rkvlelqtelkqlrnvltntqsenerlasvaqe 

lkeinqnveiqrgrlrddikeykfrearllqdys 

eleeenislqkqvsvlrqnqvefeglkheikrle 

eeieylnsqledairlkeiserqleealetlkter 

eqknslrkelshymsindsfytshlhvsldglkf 

sddaaepnndaealvngfehgglaklpldnkts 

tpkkeglappspslvsdllselniseiqkxkqqlm 

qmerekagllatlqdtqkqlehtrgslseqqek 

vtrltenlsalrklqaskerqtaldnekdrdsh 

edgdyyevdingpeblackyhvavaeagelreq 

lkalrstheareaqhaeekgryeaegqaltekv 

sllekasrqdrellarlekelkkvsdvagetqg 

slsvaqdelvtfseelanlyhhvcmcnnetpnr 

vmldyyregqggagrtspggrtspeargrrspi 

llpkgllapeagradggtgdsspspgsslpsplsd 

prrepmniynlianrdqikhlqaavdrttelsrq 

riasqelgpavdkdkealmeeelklksllsikre 

QITTLRTVLKANKQTAEVALANLKSKYENEKAM 

vtetmmklrnelkalkedaatfsslramfatrc 

deyitqldemqrqlaaaedekktlnsllrmaiq 

qklaltqrlelleldheqtrrgrakaapktbcpa 

1^5 v&H 1 ^-ACAbDRAEGTGLANQVFCSEKHSIYC 
D 


3097 


A 




879 



mvkwpatrgnlprsqltgthqhcqprepkita 
serlrrrpratarlrahaappepplavfappsdr 
kellalpvacdpviasvmswvqaasuqgpgdk 
3dvfdeeadesllaqrewqsnmqrrvkegyrd
GIDAGKAVTLQQGFNQGYKKGAEVILNYGRLRG 
TLSALLSWCHLHN>mSTLINKINNLLDAVGQCEE 
YVLKHLKSITPPSHWDLLDSIEDMDLCHVVPAE 
KKIDEAKDERLCENNAEFNKNCSKSHSGIDCSYV 
ECCRTQEHAHSGKPKPHMDFGTDSQF 


3098 


A 


2 


505 


GAATLLRSASSAARKAAEAEQVWLHLHRYLSA 

DRRVLGLREWGRPASERECSLCQRLKRELNMGD 

VEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGL 

FGRKTGQAPGYSYTAAlSKNKGnWGEDTLMEYL 

E^^>KKYIPGTKMIFYGIKKKEERADLIAYLKK^ 

NE 


3099 


A 


144 


1386 


WAVGQARSFPSHPRMSSWIWSRRWSPSVALRVT 

CTSTSSQRWTVLALSKPGSQQQVSMHTPAPGPPT 

AGHTEPPSEPPRRARVAKYRAKFDPRVTAKYDIK 

ALIGRGSFSRVVRVEHRATHQPYAIKMIETKYRE 

GREVCESELRVLRRVRHANnQLVEVFETQERVY 

MVMELATGGELFDRIIAKGSFTEKDATRVLQMV 

LDGVRYLHALGITHRDLKPENLLYYHPGTDSKIII 

TDFGLASARKKGDDCLMKTTCGTPEYIAPEVLV 

RKPYTNSVDMWALGVIAYILLSGTMPFEDDNRT 

RLYRQILRGKYSYSGEPWPSVSNLAKDFIDRLLT 

VDPGARMTALQALRHPWWSMAASSSMKNLHR 

SISQNLLKRASSRCQSTKSAQSTRSSRSTRSNKSR 

RVRERELREL 


3100 


A


3 


1500 


ARWNGRWVQVPAWPGPGCGTNASGERQRQLPR 
AWRPVGRTLGSEPIALAWSPPLYLFPIPLPSWAVS 
^P^tKdTMFABBDYDIEEDKLGff^ 

kSaqnligisigggaqycpclyivqvfdntpaal 

dgtvaagdeitgvngrsikgktkvevakiyllqev 

kgevtnm^qadpkqgmsldivlkkvkhrlv 

enmssgtadalglsrailcndglvkrleelerta 

elykgmtehitcnllrafyelsqthrgngipqsc 

AFGDVFSVIGVREPQPAASEAFVKFADAHRSIEK 

FGIRLLKTIKPMLTDLNTYLNKAIPDTRLTIKKYL 

DVKFEYLSYCLKVKEMDDEEYSCIALGEPLYRV 

STGNYEYRLILRCRQEARAIU^SQMRKDVLEKME 

LLDQKHVQDIVFQLQRLVSTMSKYYNDCYAVLR 

DADVFPffiVDLAHTTLAYGLNQEEFTDGEEEEEE 

EDTAAGEPSRDTRGAAGPLDKGGSWCDS 


3101


A 


1173 


197 


QGMDSKQQCVKLNDGHFMPVLGFGTYAPPEVP 

RSKALEVTKLAIEAGFRHIDSAHLYNNEEQVGLA 

IRSKIADGSVKREDIFYTSKLWSTFHRPELVRPAL 

ENSLKKAQLDYVDLYLIHSPMSLKPGEELSPTDE 

NGKVIFDIVDLCITWEAMEKCKDAGLAKSIGVS 

NFNRRQLEMILNKPGLKYKPVCNQVECHPYFNR 

SKLLDFCKSKDIVLVAYSALGSQRDKRWVDPNS 

PVLLEDPVLCALAKKHKRTPALIALRYQLQRGV 

WLAKSYNEQRIRQNVQVFEFQLTAEDMKAIDG 

LDRNLHYFNSDSFASHPNYPYSDEY 


3102 


A 


144 


1098 


EQPRPPPCGRRPLPLGSAPCRVRLGRAPRQAPAM 

SMLPSFGFTQEQVACVCEVLQQGGNLERLGRFL 

WSLPACDHLHKNESVLKAKAWAFHRGNFREL 

YOLESHQFSPPINHPKLQQLWLKAHYVEAEKLR 

GRPLGAVGKYRVRQKFPLPRTIWDGEETSYCFK 

EKSRGVLREWYAHNPYPSPREKRELAEATGLTT

TQVSNWFKN^QRDRAAEAKERENTENNNSSSN 
KQNQLSPLEGGKPLMSSSEEEFSPPQSPDQNSVLL 
LQGNMGHARSSNYSLPGLTASQPSHGLQTHQHQ 
LQDSLLGPLTSSLVDLGS 


3103 


A 


111 

1 227 


1582 


LWSWGCHIMADNDTDRNQ'rEKLLKRVRELEQ 

EVQRLKKEQAKNKEDSNIRENSSGAGKTKRAFD 

FSAHGRRHVALRIAYMGWGYQGFASQENTNNTI 

EEKLFEALTKTRLVESRQTSNYHRCGRTDKGVS 

AFGQVISLDLRSQFPRGRDSEDFNVKEEANAAAE 

EIRYTHILNRVLPPDIRILAWAPVEPSFSARFSCLE 

RTYRYFFPRADLDIVTMDYAAQKYVGTEroFRNL 

CKMDVANGVINFQRTILSAQVQLVGQSPGEGRW 

QEPFQLCQFEVTGQAFLYHQVRCMMAILFLIGQ 

GMEKPEIIDELLNIEKNPQKPQYSMAVEFPLVLY 

DCKFENVKWIYDQEAQEFNITHLQQLWANHAV 

KTHMLYSMLQGLDTVPVPCGIGPKMDGMTEWG 

NVKPSVIKQTSAFVEGVKlVniTYKPLMDRPKro^ 

LESRIQHFVRRGRIEHPItt,FHEEETKAKRDCNDT 

LEEDNTNLETPTKRVCVDTEIKSII 


3104 


A 




1519 


VTLIKMNAMLETPELPAVFDGVKLAAVAAVLYV 
IVRCLNLKSPTAPPDLYFQDSGLSRFLLKSCPLLT 
KEYIPPLIWGKSGfflQTALYGKMGRVRSPHPYGH 
RKFITMSDGATSTFDLFEPLAEHCVGDDITMVICP 
GIANHSEKQYIRTFVDYAQKNGYRCAVLNHLGA 
LPNIELTSPRMFTYGCTWEFGAMVNYIKKTYPLT 
QLVWGFSLGGNIVCKYLGETQANQEKVLCCVS 
\ VCQGYSALRAQETFMQWDQCRRFYNFLMADN 
MKKIILSHRQALFGDHVKKPQSLEDTDLSRLYTA 
TSLMQIDDNVMRKFHGYNSLKEYYEEESCMRYL 
HRIYWLMLVNAADDPLVHESLLTIPKSLSEKRE 
^^S^V[FVLPLHGGHLGFFEGS\^:FPEPLTW]Vro 
VEYANAICQWERNKLQCSDTEQVEADLE 


3105 


A 


1 


1251 


MGLLLMILASAVLGSFLTLLAQFFLLYRRQPEPP 

ADEAARAGEGFRYIKPVPGLLLREYLYGGGRDE 

EPSGAAPEGGATPTAAPETPAPPTRETCYFLNATI 

LFLFRELRDTALTRRWTKKDCVEFEELLQTKTA 

GRLLEGLSLRDVFLGETVPFIKTIRLVRPVVPSAT 

GEPDGPEGEALPAACPEELAFEAEVEYNGGFHLA 

IDVDLVFGKSAYLFVKLSRWGRLRLVFTRVPFT 

HWFFSFVEDPLIDFEVRSQFEGRPMPQLTSIIVNQ 

LKKIIKRKHTLPNYKIRFKPFFPYQTLQGFEEDEE 

HlfflQQWALTEGRLKVTLLECSRLLIFGSYDREA 

NVHCTLELSSSVWEEKQRSSIKTGTISLTAVFMG 

WHRVSEAFPGLWYKLLVDLPFWGLEDGGPLLT 

VPLRQCPG 


3106 


A


972 


468 


MAAAGAGRLRRVASALLLKSPRLPARELSAPAR 

LYHKKWDHYENPRNVGSLDKTSKNVGTGLVG 

APACGDVMKLQIQVDEKGKIVDARFKTFGCGSA 

lASSSLATEWVKGKTVEEALTIKNTDIAKELCLPP 

VKLHCSMLAEDAIKAALADYKLKQEPKKGEAE 
KK 


3107 


A 


106 


1221 



rCQDVRSVFSLVRAMFGEESJAGAGWHREEDM 
RKELQLSLSVTLLLVCGFLYQFTLKSSCLFCLPSF 
KSHQGLEALLSHRRGIVFLETSERMEPPHLVSCS 
^ESAAKIYPEWPVVFFMKGLTDSTPMPSNSTYPA
FSFLSAmNVFLFPLDMBCRLLEDTPLFSWYNQINA 

SAERNWLfflSSDASRLAIIWKYGGIYMDTDVlSIR 

PIPEENFLAAQASRYSSNGIFGFLPHHPFLWECME 

NFVEHYNSAIWGNQGPELMTRMLRVWCKJLEDF 

QEVSDLRCLNISFLHPQRFYPISYREWRRYYEVW 

DTEPSFNVSYALHLWNHMNQEGRAVIRGSNTLV 

ENLYRKHCPRTYRDLIKGPEGSVTGELGPGNK 


3108 


A 


1612 


839 


EVALFCTEMAAGMYLEHYLDSIENLPFELQRNFQ 

LMRDLDQRTEDLKAEIDKLATEYMSSARSLSSEE 

KLALLKQIQEAYGKCKEFGDDKVQLAMQTYEM 

VDKHIRRLDTDLARFEADLKEKQIESSDYDSSSS 

KGKKKGRTQKEKKAARARSKGKNSDEEAPKTA 

QKKLKLVRTSPEYGMPSVTFGSVHPSDVLDMPV 

DPNEPTYCLCHQVSYGEMIGCDNPDCSIEWFHFA 

CVGLTTKPRGKWFCPRCSQERKKK 


3109 


A 


1 


2613 


MVAVRAAGPREGASQDEAGTVWAPMTGCPCQC 
RPGPSWLLVDTLEPETAYPVQRPGPEQAGNQRL
QMKRAQFGPHDWLSLPVPPGPSWLLVDTLEPET 
AYQFSVLAQNKLGTSAFSEVVTVNTLAFPITTPEP 
LVLVTPPRCLIANRTQQGVLLSWLPPANHSFPIDR 
YIMEFRVAERWELLDDGIPGTEGEFFAKDLSQDT 
WYEFRVLAVMQDLISEPSNIAGVSSTDIFPQPDLT 
EDGLARPVLAGIVATICFLAAAILFSTLAACFVNK 
QRKRKLKRKKDPPLSITHCRKSLESPLSSGKVSPE 
SIRTLRAPSESSDDQGQPAAKRMLSPTREKELSL 
YKKTKRAISSKKYSVAKAEAEAEATTPIELISRGP 
• DGRFVMDPAEl^PSlQKSRRffiGFPFAEETbl^ ' 
FRQSDEENEDPLVPTSVAALKSQLTPLSSSQESYL 
PPPAYSPRFQPRGLEGPGGLEGRLQATGQARPPA 
PRPFHHGQYYGYLSSSSPGEVEPPPFYVPEVGSPL 
SSVMSSPPLPTEGPFGHPTIPEENGENASNSTLPLT 
QTPTGGRSPEPWGRPEFPFGGLETPAMMFPHQLP 
PCDVPESLQPKAGLPRGLPPTSLQVPAAYPGILSL 
EAPKGWAGKSPGRGPVPAPPAAKWQDRPMQPL 
VSQGQLRHTSQGMGIPVLPYPEPAEPGAHGGPST 
FGLDTRWYEPQPRPRPSPRQARRAEPSLHQWLQ 
PSRLSPLTQSPLSSRTGSPELAARARPRPGLLQQA 
EMSEITLQPPAAVSFSRKSTPSTGSPSQSSRSGSPS 
YRPAMGFTTLATGYPSPPPGPAPAGPGDSLDVFG 
QTPSPRRTGEELLRPETPPPTLPTLGKLRRDRPAP 
ATSPPERALSKL 


3110 


A 


88 


924 


ILGSRTMSLTNTKTGFSVKDILDLPDTNDEEGSV 

AEGPEEENEGPEPAKRAGPLGQGALDAVQSLPL 

KNPFYDSSDNPYTRWLASIEGLQYSLHGLAAGA 

PPQDSSSKSPEPSADESPDNDKETPGGGGDAGKK 

RKRRVLFSKAQTYELERRFRQQRYLSAPEREHLA 

SLIRLTPTQVKIWFQNHRYKMKRARAEKGMEVT 

PLPSPRRVAVPVLVRDGKPCHALKAQDLAAATF 

QAGIPFSAYSAQSLQHMQYNAQYSSASTPQYPT 

AHPLVQAQQWTW 


3111 


A 


595 


291


PSVASLARRFSGRALWPPSHSVPGNRALCPRLLH 
GTTLPGGNQRELARQKNMKKQSDSVKGKRRDD 
GLSAAARKQRDSTPRDSEIMQQKQKKANEKKEE 
PK 


3112 


A 


3641 


1555 


APMLQIHHFSFKLIFQNIHKSKFISQRLSQNADST

RHT^«LSN^11YSDLIVWNCCLFFRNWCNEFFLKS ~ 

CHFAQEREGSGDLCNSRAEKTKSAACVIFRRFPV 

APLIPYPLITKEDINAffiMEEDKRDLISREISKFRDT 

HKKLEEEKGKKEKERQEIEKERRERERERERERE 

RREREREREREREREKEKERERERERDRDRDRTK 

ERDRDRDRERDRDRDRERSSDRNKDRSRSREKS 

RDRERERERERERERERERERERERERERERERE 

REREKDKKRDREEDEEDAYERRKLERKLREKEA 

AYQERLKNWEIRERKKTREYEKEAEREEERRRE 

MAKEAKRLKEFLEDYDDDRDDPKYYRGSALOK 

RLRDREKEMEADERDRKREKEELEEIRQRLLAE 

GHPDPDAELQRMEQEAERRRQPQIKQEPESEEEE 

EEKQEKEEKREEPMEEEEEPEQKPCLKPTLRPISS 

APSVSSASGNATFNTPGDESPCGIIIPHENSPDQQ 

QPEEHRPKIGLSLKLGASNSPGQPNSVKRKKLPV 

DSVFNKFEDEDSDDVPRKRKLVPLDYGEDDKNA 

IXGTVNTEEKRKHIKSLIEKIPTAKPELFAYPLDW 

SIVDSILMERRIRPWINKKIIEYIGEEEATLVDLVC 

SKVMAHSPPQSILDDVAMVLDEEAEVFIVKMWR 

LLIYETEAKKIGLVK 


3113 


A 


1 


669 


VCAGIRDPCSTPLAKPAAGUAENLSFGKQPGLET 
NILKMTTPNKTPPGADPKQLERTGTVREIGSQAV 
WSLSSCKPGFGVDQLRDDNLETYWQSDGSQPHL 
VNIQFRRKTTVKTLCIYADYKSDESYTPSKISVRV 
GfNNFHM.QEIRQLELVEPSGWIHVPLTDNHKKPT 
RTFMIQIAVLANHQNGRDTHMRQIKIYTPVEESSI 
GKFPRGTTQ5FMiviYRSIR 


3114


A 


1 


1613 


MTSKJEESRRQQPTAGPAGQGKLPSPSEPQLPTPP 

TRSLHHFRRPLSPSREAQAHIAPSSELHLPQSQSA 

GPPPLGAGTEVELWPGRDEGSRGALPGSSGVKF 

VWRKIVRFPVSDQVRTLSISRLMRRLLEMMQTL 

VQFnGWRSLLGRTLGTIMNTMYVMMAQILRSH 

LIKATVIPNRVKMLPYFGIIRNRMMSTHKSKKKI 

REYYRLLNVEEGCSADEVRESFHKLAKQYHPDS 

GSNTADSATFIRIEKAYRKVLSHVIEQTNASQSK 

GEEEEDVEKFKYKTPQHRHYLSFEGIGFGTPTQR 

EKHYRQFRADRAAEQVMEYQKQKLQSQYFPDS 

VIVKNIRQSKQQKITQAIERLVEDLIQESMAKGDF 

D>a,SGKGKPLKKFSDCSYIDPMTH>}LNRILIDNG 

YQPEWILKQKEISDTIEQLREAn.VSRKKLGNPMT 

PTEKKQWNHVCEQFQENIRKLNKRINDFNLrVPI 

LTRQKVHFDAQKEIVRAQKIYETLIKTKEVTDRN 

PNNLDQGEGEKTPEIKKGFLNLMDLVEIY 


3115 


A 


1 


2036 



FRHRCGCLSYCRSRRGIRRVEPLRRARARVGPRF 

RPLCRMEIIRSNFKSNLHKVYQAIEEADFFAIDGE 

FSGISDGPSVSALTNGFDTPEERYQKLKKHSMDF 

LLFQFGLCTFKYDYTDSKYITKSFNFYVFPKPFNR 

SSPDVKFVCQSSSIDFLASQQFDFNKGFRKGIPYL 

WQEEERQLREQYDEKRSQANGAGALSYVSPNTS 

CCPVnPEDOKK'FTnnwpii^'rBnT t no-ccKTr^TT t^t 
* lixcri/v^isjsjii^y V viiJsaJ&JyJL/LQoJEENKNLDL 

3PCTGFQRKLIYQTLSWKYPKGIHVETLETEKKE 

lYIVISKVDEEERKRREQQKHAKEQEELNDAVG 

<SRVIHAIANSGKLVIGHNMLLDVMHTVHQFYC 

'LPADLSEFKEMTTCVFPRLLDTKLMASTQPFKD 

INNTSLAELEKRLKETPFNPPKVESAEGFPSYDT

ASEQLHEAGYDAYITGLCFISMANYLGSFLSPPKI 

HVSARSKLffiPFTim-FLMRVMDIPYLNLEGPDL 

QPKRDHVLHVTFPKEWKTSDLYQLFSAFGNIQIS 

WIDDTSAFVSLSQPEQVKIAVNTSKYAESYRIQT 

YAEYMGRKQEEKQIKRKWTEDSWKEADSKRLN 

PQCIPYTLQNHYYRNNSFTAPSTVGKRNLSPSQE 

EAGLEDGVSGEISDTELEQTDSCAEPLSEGRKKA 

KKLKRMKKELSPAGSISKNSPATLFEVPDTW 


3116 


A 


3 


1443 


TE^APMALAVAPWGRQWEEARALGRAVRMLQ 

RLEEQCVDPRLSVSPPSLRDLLPRTAQLLREVAH 

SRRAAGGGGPGGPGGSGDFLLIYLANLEAKSRQ 

VAALLPPRGRRSANDELFRAGSRLRRQLAKLAn 

FSHMHAELHALFPGGKYCGHMYQLTKAPAHTF 

WRESCGARCVLPWAEFESLLGTCHPVEPGCTAL 

ALRTTIDLTCSGHVSIFEFDVFTRLFQPWPTLLKN 

WQLLAVNHPGYMAFLTYDEVQERLQACRDKPG 

SYIFRPSCTRLGQWAIGYVSSDGSILQTIPANKPLS 

QVLLEGQKDGFYLYPDGKTHNPDLTELGQAEPQ 

QRIHVSEEQLQLYWAMDSTFELCKICAESNKDV 

KIEPCGHLLCSCCLAAWQHSDSQTCPFCRCEIKG 

WEAVSIYQFHGQATAEDSGNSSDQEGRELELGQ 

VPLSAPPLPPRPDLPPRKPRNAQPKVRLLKGNSPP 

AALGPQDPAPA 


3117 


A 


296 


3547


ERHSSPLLQHILTHALMRNKKHSNNWLAQHWF 
QSSIDLCFSPVGRTLRVRARKFPAIVNCTAIDWFH 
AWPQEALVSVSRRFIEETKGIEPVHKDSISLFMAH 
•J\^tT^O»JEMSTRYYQlflE 
Ki^LtCKKQT^VSEK^^ 

DLKARLASQEAELQLR14HDAEALITKIGLQTEKV 

SREKTIADAEERKVTAJQTEVFQKQRECEADLLK 

AEPALVAATAALNTLNRVNLSELKAFPNPPIAVT 

NVTAAVMVLLAPRGRVPBCDRSWKAAKVFMGK 

VDDFLQALINYDKEHIPENCLKWNEHYLKDPEF 

NPNLIRTKSFAAAGLCAWVINIIKFYEVYCDVEP 

KRQALAQANLELAAATEKLEAIRKKLVVSANYD 

lEKSEKIRWGQSIKSFEAQEKTLCGDVLLTAAFVS 

YVGPFTRQYRQELVHCKWVPFLQQKVSIPLTEG 

LDLISMLTDDATIAAWNNEGLPSDRMSTENAAIL 

THOERWPLVIDPQQQGIKWIKl^YGMDLKVTHL 

GQKGFLNAIETALAFGDVILIENLEETIDPVLDPL 

LGRiraKKGKYIRIGDKECEFNKNFia.ILHTKLAN 

PHYKPELQAQTTLLNFTVTEDGLEAQLLAEVVSI 

ERPDLEKLKLVLTKHQNDFKIELKYLEDDLLLRL 

SAAEGSFLDDTKLVERLEATKTTVAEIEHKYIEA 

KENERKIhmARECYRPVAARASLLYFVINDLQKI 

NPLYQFSLKAFNVLFHRAIEQADKVEDMQGRISI 

LMESITHAVFLYTSQALFEKDKLTFLSQMAFQIL 

LRKKEIDPLELDFLLRFTVEHTHLSPVDFLTSQSW 

SAIKAIAVMEEFRGIDRDVEGSAKQWRKWVESE 

CPEKEKLPQEWKKKSLIQKLE-LRAMRPDRMTY 

ALRNFVEEKLGAKYVERTRLDLVKAFEESSPATP 

IFFILSPGVDALKDLEILGKRLGFnDSGKFHNVSL 

GQGQETVAEVALEKASKGGHWVILQNVHLVAK 

WLGTLEKLLERFSQGSHRDYRVFMSAESAPTPD 

EHIIPQGLLENSIKITNEPPTGMLANLHAALYNFD


3118


A




226


PYSLSTSCLGSPTSPRLEMDPNCiiCATGGSCTCTG

SCKCKECKCNSCKKSECGAISRNLGLSQVRGRKP
ELGMEE


3119


A


1254


4133


PLATLTMEHQGHSEMEUPSESHPfflQLLKSNREL

LVTHIRNTQCLVDNLLKNDYFSAEDAEIVCACPT

QPDKVRKILDLVQSKGEEVSEFFLYLLQQLADAY

VDLRPWLLBIGFSPSLLTQSKVWNTDPVSRYTQ

QLRHHLGRDSKFVLCYAQKEELIXEEIYMDTIME

LVGFSNESLGSLNSLACLLDHTTOILNEQGETIFIL

GDAGVGKSMLLQRLQSLWATGRLDAGVKFFFH

FRCRMFSCFKESDRLCLQDLLFKHYCYPERDPEE

VFAFLLRFPHVALFTFDGLDELHSDLDLSRVPDS

SCPWEPAHPLVLLANLLSGKLLKGASKLLTART

GIEVPRQFLRKKVLLRGFSPSHLRAYARRMFPER

ALQDRLLSQLEANPNLCSLCSVPLFCWnFRCFQH

FRAAFEGSPQLPDCTMTLTDVFLLVTEVHLNRM

QPSSLVQRNTRSPVETLHAGRDTLCSLGQVAHR

GMEKSLFVFTQEEVQASGLQERDMQLGFLRALP

ELGPGGDQQSYEFFHLTLQAFFTAFFLVLDDRVG

TQELLRFFQEWMPPAGAATTSCYPPFLPFQCLQG

SGPAREDLFKNKDHFQFTNLFLCGLLSKAKQKLL

RHLVPAAALRRKRKALWAHLFSSLRGYLNSLPR

VQVESFNQVQAMPTFTWMURCIYETQSQKVGQL

AARGICAJTiXKLTYCNACSADCSALSFVLHHFP

KRLALDLDNNNLNDYGVRELQPCFSRLTVLRLS

VNQITDGGVKVLSEELTKYKIVTYLGLYNNQITD

VGARYVTKILDECKGLIHLKLGKNKITSEGGKY

LALAVKNSKSISEVGMWGNQVGDEGAKAFAEA

LRNHPSLTTLSLASNGISTEGGKSLARALQQOTSL

EILWLTQNELNDEVAESLAEMLKVNQTLKHLWL

IQNQITAKGTAQLADALQSNTGITEICLNGNLIKP

EEAKVYEDEKRHCF


3120


A


43


1004


QLWGl-AAGSDSRPAMGCDGGTIPKRHELVKGPK

KVEKVDKDAELVAQWNYCTLSQEILRRPIVACE

LGRLYNKDAVIEFLLDKSAEKALGKAASHIKSIK

NVTELKLSDNPAWEGDKGNTKGDKHDDLQRAR

FICPWGLEMNGRHRFCFLRCCGCVFSERALKEI

KAEVCHTCGAAFQEDDVIVLNGTKEDVDVLKTR

MEERRLRAKLEKKTKKPKAAESVSKPDVSEEAP

GPSKVKTGKPEEASLDSREBCKTNLAPKSTAMNE

SSSGKAGKPPCGATKRSIADSEESEAYKSLFTTHS

SAKRSKEESAHWVTHTSYCF


3121


A




1490


HASGPTRPVSWSFHKLKTMKHLLLLLLCVFLVK 

SQGVNDNEEGFFSARGHRPLDKKREEAPSLRPAP 

PPISGGGYRARPAKAAATQKKVERKAPDAGGCL 

HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL 

NNNVEAVSQTSSSSFQYMYLLKDLWQKRQKQV 

KDNENVVNEYSSELEKHQLYIDETVNSNIPTNLR

VLRSILENLRSKIQKLESDVSAQMEYCRTPCTVS 

CNIPWSGKECEEIIRKGGETSEMYLIQPDSSVKP 

YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW 

DPYKQGFGNVATNTDGKNYCGLPGEYWLGNDK 

ISQLTRMGPTELLIEMEDWKGDKVKAHYGGFTV 

QNEANKYQISVNKYRGTAGNALMDGASQLMGE

WO 01/57190

PCT/US01/04098



NRTMTIHNGMFFSTYDRDNDGWLTSDPRKQCSK 
EDGGGWWYNRCHAANPNGRYYWGGQYTWDM
AKHGTDDGWWMNWKGSWSMKmSMKIRP 
FFPQQ 


3122 


A 


3 


1490 


HASGPTRPVSWSFHKLKTMKHLLLLLLCVFLVK 

SQGVNDNEEGFFSARGHRPLDKKREEAPSLRPAP 

PPISGGGYRARPAKAAATQKKVERKAPDAGGCL 

HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL 

NNNVEAVSQTSSSSFQYMYLLKDLWQKRQKQV

KDNENVVNEYSSELEKHQLYIDETVNSNIPTNLR

VLRSILENLRSKIQKLESDVSAQMEYCRTPCTVS 

CNIPWSGKECEEIIRKGGETSEMYLIQPDSSVKP 

YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW 

DPYKQGFGNVATNTDGKNYCGLPGEYWLGNDK

ISQLTRMGPTELLIEMEDWKGDKVKAHYGGFTV 

QNEANKYQISVNKYRGTAGNALMDGASQLMGE 

NRTMTIHNGMFFSTYDRDNDGWLTSDPRKQCSK 

EDGGGWWYNRCHAANPNGRYYWGGQYTWDM 

AKHGTDDGVVWMNWKGSWSMKJmSMKIRP 

FFPQQ 


3123 


A 


3 


1490 



HASGPTRPVSWSFHKLKTMKHLLLLLLCVFLVK 
SQGVNDNEEGFFSARGHRPLDKKREEAPSLRPAP 
PPISGGGYRARPAKAAATQKKVERKAPDAGGCL
HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL 
NNNVEAVSQTSSSSFQYMYLLKDLWQKRQKQV 
KDNENVVNEYSSELEKHQLYIDETVNSNIPTNLR 
VLRSILENLRSKIQKLESDVSAQMEYCRTPCTVS
CNIPVVSGKECEEIIRKGGETSEMYLIQPDSSVKP
YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW
DPYKQGFGNVATNTDGKNYCGLPGEYWLGNDK

ISQLTRMGPTELLIEMEDWKGDKVKAHYGGFTV 

QNEANKYQISVNKYRGTAGNALMDGASQLMGE 

NRTMTIHNGMFFSTYDRDNDGWLTSDPRKQCSK 

EDGGGWWYNRCHAANPNGRYYWGGQYTWDM 

AKHGTDDGWWMNWKGSWYSMKKMSMKJRP 

FFPQQ 


3124 


A 


3 


544 


RVDDFVLLRSRLALRWLSHVRRPSRRVPRMPRG 

SRSRTSRMAPPASRAPQMRAAPRPAPVAQPPAA 

APPSAVGSSAAAPRQPGLMAQMATTAAGVAVG 

SAVGHTLGHAITGGFSGGSNAEPARPDITYQEPQ 

GTQPAQQQQPCLYEIKQFLECAQNQGDIKLCEGF 

NEVLKQCRLANGLA 


3125 


A 


3 


571 


GNSYNHRSLAAYPYMSHSQHSPYLQSYHNSSAA 

AQTRGDDTDQQKTTVIENGEnU^GKGKKIRKPR 

TIYSSLQLQALNHRFQQTQYLALPERAELAASLG 

LTQTQVKIWFQNKRSKFKKLLKQGSNPHESDPL 

QGSAALSPRSPALPPVWDVSASAKGVSMPPNSY 

MPGYSHWYSSPHQDTMQRPQMM 


3126 


A 


43 


5377 


LSVFFPIPVDGRDRGSNPSLESTSSELSTSTSEGSL 

SAMSGRNELHSRLHPHPQSSLIPMMFSPPESLLAS 

CILRGNFAEAHQVLFTFNLKSSPSSGELMFMERY 

QEVIQELAQVEHKIENQNSDAGSSTIRRTGSGRST 

LQAIGSAAAAGMVFYSISDVTDKLLNTSGDPIPM 

LQEDFWISTALVEPTAPLREVLEDLSPPAMAAFD 

LACSQCQLWKTCKQLLETAERRLNSSLERRGRRI

DHVLLNADGIRGFPWLQQISKSLNYLLMSASQT 

KSESVEEKGGGPPRCSITELLQMCWPSLSEDCVA 

SHTTLSQQLDQVLQSLREALELPEPRTPPLSSLVE 

QAAQKAPEAEAHPVQIQTQLLQKNLGKQTPSGS 

RQMDYLGTFFSYCSTLAAVliQSLSSEPDHVEVK 

VGNPFVLLQQSSSQLVSHLLFERQVPPERLAALL 

AQENLSLSVPQVIVSCCCEPLALCSSRQSQQTSSL 

LTRLGTLAQLHASHCLDDLPLSTPSSPRTTENPTL 

ERKPYSSPRDSSLPALTSSALAFLKSRSKLLATVA 

CLGASPRLKVSKPSLSWKELRGRREVPLAAEQV 

ARECERLLEQFPLFEAFLLAAWEPLRGSLQQGQS 

LAVNLCGWASLSTVIiGLHSPIALDVLSEAFEES 

LVARDWSRALQLTEVYGRDVDDLSSIKDAVLSC 

AVACDKEGWQYLFPVKDASLRSRLALQFVDRW 

PLESCLEBLAYCISDTAVQEGLKCELQRKLAELQ 

VYQKILGLQSPPVWCDWQTLRSCCVEDPSTVMN 

MILEAQEYELCEEWGCLYPIPREHUSLHQKHLL 

HLLERRDHDKALQLLRRDPDPTMCLEVTEQSLDQ 

HTSLATSHFLANYLTTHFYGQLTAVRHREIQALY 

VGSKILLTLPEQHRASYSHLSSNPLFMLEQLLMN 

MKVDWATVAVQTLQQLLVGQEIGFTMDEVDSL 

LSRYAEKALDFPYPQREKRSDSVIHLQEIVHQAA 

DPETLPRSPSAEFSPAAPPGISSIHSPSLRERSFPPT 

QPSQEFVPPATPPARHQWVPDETESICMVCCREH 

FT^lF^mRHHCRRCGRLVCSSCSTKKMVVEGCRE 

NPARVCDQCYSYCNKDVPEEPSEKPEALDSSKSE 

SPPYSFWRVPKADEVEWILDLKEEENELVRSEF 

YYEQAPSASLCiAlLNLHRDSIACGHQLIEHCCRL 

SKGLTNPEVDAGLLTDIMKQLLFSAKMMFVKAG 

QSQDLALCDSYISKVDVLNILVAAAYRHVPSLDQ 

ILQPAAVTRLRNQLLEAEYYQLGVEVSTKTGLDT 

TGAWHAWGMACLKAGNLTAAREKFSRCLKPPF 

DLNQLNHGSRLVQDVVEYLESTVRPFVSLQDDD 

YFATLRELEATLRTQSLSLAVIPEGKIMNNTYYQ 

ECLFYLHNYSTNLAIISFYVRHSCLREALLHLLNK 

ESPPEVFIEGIFQPSYKSGKLHILENLLESIDPTLES 

WGKYLIAACQHLQKKNYYHILYELQQFMKDQV 

RAAMTCIRFFSHKAKSYTELGEKLSWLLKAKDH 

LKIYLQETSRSSGRKKTTFFRKKMTAADVSRHM 

NTLQLQMEVTRFLHRCESAGTSQITTLPLPTLFG 

^n>m^KMDVACKVMLGGKNVEDGFGIAFRVLQ 

DFQLDAAMTYCRAARQLVEKEKYSEIQQLLKCV 

SESGMAAKSDGDTILLNCLEAFKRIPPQCCFCSA 

QELEGUQAIHNDDNKVRAYLICCKLRSAYLIAV 

KQEHSRATALVQQVQQAAKSSGDAVVQDICAO 

WLLTSHPRGAHGPGSRK 


3127


A


467


1259


HLGPPLAWIPAASLTSTKGEFGVEDDRPARGPPP

PKSEEASWSESGVSSSSQDGPFAGGEVDKRLHQL

KTQLATLTSSI^TVTQEKSRMEASYLADKKKMK

yULbUAS>NKAEEERARLEGELKGLQEQIAETKA

RLITQQHDRAQEQSDHALMLRELQKLLQEERTQ

RQDLELRLEETREALAGRAYAAEQMEGFELQTK

QLTREVEELKSELQAIRDEKNQPDPRLQELQEEA

ARLKSHFQAQLQQEMRKVUmSFKHQPLT


3128


A


1854


798


^SGSPAPSSSSAMAAACGPGAAGYCLLLGLHLFL

LTAGPALGWNDPDRMLLRDVKALTLHYDRYTT 
SRRLDPIPQLKCVGGTAGCDSYTPKVIQCQNKG 

WDGYDVQWECKTDLDIAYKFGKTWSCEGYES 

SEDQYVLRGSCGLEYNLDYTELGLQKLKESGKQ 

HGFASFSDYYYKWSSADSCNMSGLmVVLLGlA 

FWYKLFLSDGQYSPPPYSEYPPFSHRYQRFTNS 

AGPPPPGFKSEFTGPQNTGHGATSGFGSAFTGQQ 

GYENSGPGFWTGLGTGGILGYLFGSNRAATPFSD 

SWYYPSYPPSYPGTWNRAYSPLHGGSGSYSVCS 

NSDTKTRTASGYGGTRRR 


3129 


A 


2340 


1192 


EIARRPKQQSSEKSIO'MRIWLTIFELFPLKLVEK 

CESSVSLTVPPVVKLENGSSTNVSLTLRPPLNATL 

VITFEITFRSKNITILELPDEWVPPGVTNSSFQVT 

SQNVGQLTVYLHGNHSNQTGPRIRFLVIRSSAISn 

NQVIGWT^AWSISFYPQVIMhmilRKSVIGLSF 

DFVALNLTGFVAYSVFNIGLLWVPYIKEQFLLKY 

PNGVNPVNSNDVFFSLHAWLTLinVQCCLYERG 

GQRVSWPAIGFLVLAWLFAFVTMIVAAVGVITW 

LQFLFCFSYIKLAVTLVKYFPQAYMNFYYKSTEG 

WSIGNVLLDFTGGSFSLLQMFLQSYNNDQWTLIF 

GDPTKFGLGWSIVFDVVFFIQHFCLYRKRPGYD 

QLN 


3130 


A 


31 


2026 


CWWPPLLPQLEPEPPPLRPRVAASQGGGMLGKG 
WGGGGGTKAPKPSFVSYVRPEEIHTNEKEVTEK 
EVTLHLLPGEQLLCEASTVLKYVQEDSCQHGVY 
GRLVCTDFKIAFLGDDESALDNDETQFKNKVIGE 
-hlDITLHGVDQn^GVFDEKKKTLFGQLKKYPEKLII 
HCKDLRWQFCLRYTiaEEEVKRIVSGI^ 
KLLKRLFLFSYATAAQNNTVTDPKNHTVMFDTL 
KDWCWELERTKGNMKYKAVSVNEGYKVCERL 
PAYFWPTPLPEENVQRPQGHGIPIWCWSCHNGS 
ALLKMSALPKEQDDGILQIQKSFLDGIYKTIHRPP 
YEIVKTEDLSSNFLSLQEIQTAYSKFKQLFLIDNST 
EFWDTDIKWFSLIJBSSSWLDIIRRCLKKAIEITEC 
MEAQNMNVLLLEENASDLCCLISSLVQLMMDPH 
CRTRIGFQSLIQKEWVMGGHCFLDRCNHLRQND 
KEEHQRQLSLPLTQSKSSPKRGFFREETDHLIKNL 
LGKRISKLINSSDELQDNFREFYDSWHSKSTDYH 
GLLLPHIEGPEIKVWAQRYLRWIPEAQILGGGQV 
ATLSKLLEMMEEVQSLQEKIDERHHSQQAPQAE 
APCLLKNSARLSSLFPFALLQRHSSKPVLPTSGW 
KALGDEDDLAKREDEFVDLGDV 


3131 


A 


126 


965 


QSRSRPRREGVGTGSRAVLCILATCGSKMSDIGD 

WFRSIPAITRYWFAATVAVPLVGKLGLJSPAYLF 

LWPEAFLYRFQIWRPITATFYFPVGPGTGFLYLV 

NLYFLYQYSTRLETGAFDGRPADYLFMLLFNWI 

CmTGLAMDMQLLMIPLlMSVLYVWAQLNRDM 

IVSFWFGTRFKACYLPWVILGFNYIIGGSVINELIG 

NLVGHLYFFLMFRYPMDLGGRNFLSTPQFLYRW

LPSRRGGVSGFGVPPASMRRAADQNGGGGRHN 

WGQGFRLGDQ 


3132 


A 


2 


350 


FVAGWRALTAFSTSARLRAFGWQAAARLLVFG 
ARGVGLGSGAPGSLPCYLRMDALALLGGLVNV 
ARLPERWGPGRFDYWGNSHQIMHLLSVGSILQL 
HAGWPDLLWAAHHACPRD


3133 


A 


1 


2921 


MTCFKGQKGEQRSHAFEANKDHKAKVPSPNLYS 

QLNALQFTVDBRSILWLNQFLLDLKQSLNQFMA 

VYKLNDNSKSDEHVDVRVDGLMLKFVIPSEVKS 

ECHQDQPRAISIQSSEMIATOTRHCPNCRHSDLEA 

LFQDFKDCDFFSKTYTSFPKSCDNFNLLHPIFQRH 

AHEQDTKMHEIYKGNITPQLNKNTLKTSAATDV 

WAVYFSQFWBDYEGMKSGKGRPISFVDSFPLSIW 

ICQPTRYAESQKEPQTCNQVSLNTSQSESSDLAG 

RLKRKKLLKEYYSTESEPLTNGGQKPSSSDTFFR 

FSPSSSEADIHLLVHVHKHVSMQINHYQYLLLLF 

LHESLILLSENLRKDVEAVTGSPASQTSICIGEULR 

SAELALLLHPVDQANILKSPVSESVSPWPDYLP 

TENGDFLSSKRKQISRDINRIRSVTVNHMSDNRS 

MSVDLSHIPLKDPLLFKSASDTNLQKGISFMDYL 

SDKHLGKISEDESSGLVYKSGSGEIGSETSDKKDS 

FYTDSSSVLNYREDSNILSFDSDGNQNILSSTLTS 

KGNETIESIFKAEDLLPEAASLSENLDISKEErPPV 

RTLKSQSSLSGKPKERCPPNLAPLCVSYKNMKRS 

SSQMSLDTISLDSMILEEQLLESDGSDSHMFLEKG 

NKKNSTTNYRGTAESVNAGANLQNYGETSPDAI 

STNSEGAQENHDDLMSVWFKITGVNGEIDIRGE 

DTEICLQVNQVTPDQLGNISLRHYLCNRPVGSDQ 

KAVmSKSSPEISLRFESGPGAVfflSLLAEKNGFL 

QCHIENFSTEFLTSSLMNIQHFLEDETVATVMPM 

KIQVSNTKINLKDDSPRSSTVSLEPAPVTVHIDHL 

WERSDDGSFHIRDSHMLNTGNDLKENVKSDSV 

LLTSGKYDLKKQRSVTQATQTSPGVPWPSQSAN 

FPEFSFDFTREQLMEENESLKQELAKAKMALAE 

AHLEKDALLUHKKMTVE 


3134 


A 


9 


1579 


EEEGLSGGGPRVPCSLWGKQTMDYDFKAKLAA 

ERERVEDLFEYEQCKVGRGTYGHVYKARRKDG 

KDEBOEYALKQIEGTGISMSACREIALLRELKHPN 

VIALQKVFLSHSDRKVWLLFDYAEHDLWHIIKFH 

RASKANKKPMQLPRSMVKSLLYQILDGIHYLHA 

NWVLHRDLKPANILVMGEGPERGRVKIADMGF 

ARLFNSPLKPLADLDPVWTFWYRAPELLLGAR 

HYTKAIDIWAIGCIFAELLTSEPIFHCRQEDKTSN 

PFHHDQLDRIFSVMGFPADKDWEDIRKMPEYPT 

LQKDFRRTTYANSSLKYMEKHKVKPDSKVFLL 

LQKLLTMDPTKRITSEQALQDPYFQEDPLPTLDV 

FAGCQBPYPKREFLNEDDPEEKGDKNQQQQQNQ 

HQQPTAPPQQAAAPPQAPPPQQNSTXJTNGTAGG 

AGAGVGGTGAGLQHSQDSSLNQVPPNKKPRLGP 

SGANSGGPVMPSDYQHSSSRLNYQSSVQGSSQS 

QSTLGYSSSSQQSSQYHPSHQAHRY 


3135 


A 


3 


1111 



ERKMAEPPSPVHCVAAAAPTATVSEKEPFGKLQ

LSSRDPPGSLSAKKVRTEEKKAPRRVNGEGGSG 

GNSRQLQPPAAPSPQSYGSPASWSFAPLSAAPSPS 

SSRSSFSFSAGTAVPSSASASLSQPGPRKLLVPPTL 

i^rxAKlsr xirU-»LrL/r AAAAAASANAKSRRPKEKREKE 

RRRHGLGGAREAGGASREENGEVKPLPRDKIBCD 

KIKERDKEKEREKKKHKVMNEIKKENGEVm 

KSGKEKPKTMEDLQIBCKVKKKKKXKH^ 

KRPKMYSKSIQTICSGLLTDVEDQAAKGILNDNI 

n)YVGKNLDTKNYDSKIPENSEFPFVSLKEPRVQ 

NNLKRLDTLEFKQLIHIEHQPNGGASVIHCLQ 


3136 


A 


1442 


682 


TAAMSIFTPTNQIRLT^^V^AVVIy^DKRAGKR^ 

YKNKVVGWRSGVEKDLDEVLQTHSVFVNVSKG 

QVAKKEDLISAFGTDDQTEICKQILTKGEVQVSD 

KERHTQLEQl^RDIATIVADKCVNPETKRPYTVI 

LIERAMKDIHYSVKTNKSTKQQALEVIKQLKEK 

MKIERAHMRLRFILPVNEGKKLKEKLKPLIKVffiS 

EDYGQQLEIVCLIDPGCFREIDELIKKETkGKGSL 

EVLNLKDVEEGDEKFE 


3137 


A 


1 


3143 


MVEGKRHVLHGGRQERMRAKQKGKPLKSSDL 

VRLIHYHHNSSPLHKQSSGPSSSPAAAAAPEKPG 

PKAAEVGDDFLGDFWGERVWVNGVKPGWQY 

LGETQFAPGQWAGVVLDDPVGKNDGAVGGVR 

YFECPALQGDFTRPSKLTRQPTAEGSGSDAHSVES 

LTAQNLSLHSGTATPPLTSRVIPLRESVLNSSVKT 

GNESGSNLSDSGSVKRGEKDLRLGDRVLVGGTK 

TGVVRYVGETDFAKGEWCGVELDEPLGKNDGA 

VAGTRYFQCPPKFGLFAPIHKVIRIGFPSTSPAKA 

KKTKRMAMGVSALTHSPSSSSISSVSSVASSVGG 

RPSRSGLLTETSSRYARKISGTTALQEALKEKQQ 

HffiQLLAERDLERAEVAKATSHICEVEKEIALLK 

AQHEQYVAEAEEKLQRARLLVESVRKEKVDLSN 

QLEEERRKVEDLQFRVEEESITKGDLETQTQLEH 

ARIGELEQSLLLEKAQAERLLRELADNRLTTVAE 

KSRVLQLEEELTLRRGEIEELQQCLLHSGPPPPDH 

PDAAEILRLRERLLSASKEHQRESGVLRDKYEKA 

LKAYQAENhDKLRAANEkYAQEVAGLKDkVQQ 

ATSENMGLMDNWSKLDSl^ 

TLNSGPGAQQKEIGELKAVMEGKMEHQLELGN 

LQAKHDLETAMHVKEKEALREKLQEAQEELAG 

LQRHWRAQLEVQASQHRLELQEAQDQRRDAEL 

RVHELEKLDVEYRGQAQAIEFLKJEQISLAEKKML 

DYERLQRAEAQGKQEVESLREKLLVAENRLQAV 

EALCSSQHTEIMIESNDISEETIRTKETVEGLQDKL 

NKRDKEVTALTSQTEMLRAQVSALESKCKSGEK 

KVDALLBCEKRRLEAELETVSRKTHDASGQLVLIS 

QELLRKERSLNELRVLLLEANRHSPGPERDLSRE 

VHKAEWRIKEQKLKDDIRGLREKLTGLDKEKSL 

SDQRRYSLEDPSSAPELLRLQHQLMSTEDALRDA 

LDQAQQVEKLMEAMRSCPDKAQTIGNSGSANGI 

HQQDKAQKQEDKH 


3138 


A 


110 


2499 


QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSAL 

TPSIWPQEILAKYTQKEESAEQPEFYYDEFGFRV 

YKEEGDEPGSSLLANSPLMEDAPQRLRWQAHLE 

FTHNHDVGDLTWDKIAVSLPRSEKLRSLVLAGff 

HGMRPQLWMRLSGALQKKRNSELSYREIVKNSS 

NDETIAAKQIEKDLLRTMPSNACFASMGSIGVPR 

LRRVLRALAWLYPEIGYCQGTGMVAACLLLFLE 

EEDAFWMMSAIIEDLLPASYFSTTLLGVQTDQRV 

LRHLIVQYLPRLDKLLQEHDDELSHTLHWFLTAF 

ASWDIKLLLRIWDLFFYEGSRVLFQLTLGMLHL 

KEEELIQSENSASIFNTLSDIPSQMEDAELLLGVA 

MRLAGSLTDVAVETQRRKHLAYLIADQGQLLGA 

GTLTNLSQWRRRTQRRKSTITALLFGEDDLEAL 

KAKhOKQTELVADLREAILRVARHFQCTDPKNCS 

VVSRQLPGLLPNTALTPFIPLVGLCSLWQELTPD 

YSMESHQRDHENYVACSRSHRRRAKALLDPERH 

DDDELGFRKNDnnvSQKDEHCWVGELNGLRG 

WFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLV 

RGTLCPALKALFEHGLKKPSLLGGACHPWLFIEE 

AAGREVERDFASVYSRLVLCKTFRLDEDGKVLT 

PEELLYRAVQSVNVTHDAVHAQMDVKLRSLICV 

GLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRS 

PGWVQIKCELRVLCCFAFSLSQDWELPAKREAQ 

QPLKEGVRDMLVKHHLFSWDVDG 


3139 


A 


110 


2499 


QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSAL 

TPSIWPQEILAKYTQKEESAEQPEFYYDEFGFRV 

YKEEGDEPGSSLLANSPLMEDAPQRLRWQAHLE 

FTHNHDVGDLTWDKIAVSLPRSEKLRSLVLAGIP 

HGMRPQLWMRLSGALQKKRNSELSYREIVKNSS 

NDETIAAKQBEKDLLRTMPSNACFASMGSIGVPR 

LRRVLRALAWLYPEIGYCQGTGMVAACLLLFLE 

EEDAFWMMSAEEDLLPASYFSTTLLGVQTDQRV 

LRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAF 

ASWDIKLLLRIWDLFFYEGSRVLFQLTLGMLHL 

KEEELIQSENSASEFNTLSDIPSQMEDAELLLGVA 

MRLAGSLTDVAVETQRRKHLAYLIADQGQLLGA 

GTLTNLSQWRRRTQRRKSTITALLFGEDDLEAL 

KAKNIKQTELVADLREAILRVARHFQCTDPKNCS 

WSRQLPGLLPNTALTPPTPLVGLCSLWQELTPD 

YSMESHQRDHENYVACSRSHRRRAKALLDFERH 

DDDELGFRKNDIITIVSQKDEHCWVGELNGLRG 

WFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLV 

RGTLCPALKALFEHGLKKPSLLGGACHPWLFIEE 

AAGREVERDFASVYSRLVLCKTFRLDEDGKVLT 

PBELLYRAVQSVNVTHDAVHAQMDVKLRSLICV 

GLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRS 

PGWVQKCELRVLCCFAFSLSQDWELPAKREAQ 

QPLKEGVRDMLVKHHLFSWDVDG 


3140 


A 


1 


4939 


SAALGASLAIPRPGLPGVHGRGPGTLSGRAMEG 

AEPRARPERLAEAETRAADGGRLVEVQLSGGAP 

WGFTLKGGREHGEPLVITKIEEGSKAAAVDKLL 

AGDEIVGENDIGLSGFRQEAICLVKGSHKTLKLV 

VKRRSELGWRPHSWHATKJ^SDSHPELAASPFTST 

SGCPSWSGRHHASSSSHDLSSSWEQTNLQRTLD 

HFSSLGSVDSLDHPSSRLSVAKSNSSIDHLGSHSK 

RDSAYGSFSTSSSTPDHTLSKADTSSAENILYTVG 

LWEAPRQGGRQAQAAGDPQGSEEKLSCFPPRVP 

GDSGKGPRPEYNAEPKLAAPGRSNFGPVWYVPD 

KKKAPSSPPPPPPPLRSDSFAATXSHEKAQGPVFS 

EAAAAQHFTALAQAQPRGDRRPELTDRPWRSAH 

PGSLGKGSGGPGCPQEAHADGSWPPSKDGASSR 

LQASLSSSDVRFPQSPHSGRHPPLYSDHSPLCADS 

LGQEPGAASFQNDSPPQVRGLSSCDQKLGSGWQ 

urKrUVl^uULQAAQLWAGCWPSDTALGALESL 

PPPTVGQSPRHHLPQPEGPPDARETGRCYPLDKG 

AEGCSAGAQEPPRASRAEKASQRLAASITWADG 

ESSRICPQETPLLHSLTQEGKRRPESSPEDSATRPP 

PFDAHVGKPTRRSDRFATTLRl^IQMHRAKLQK 

SRSTVALTAAGEAEDGTGRWRAGLGGGTQEGPL 

AGTYKDHLKEAQARVLRATSFKRRDLDPNPGDL 
YPESLEHRMGDPDTVPHFWEAGLAQPPSSTSGGP 
HPPmGGRRRFTAEQKLKSYSEPEKMNEVGLTRG 
YSPHQHPRTSEDTVGTFADRWKFFEETSKPVPQR 
PAQKQALHGIPRDKPERPRTAGRTCEGTEPWSRT 
TSLGDSLNAHSAAEKAGTSDLPRRLGTFAEYQAS 
WKEQRKPLEARSSGRCHSADDILDVSLDPQERPQ 
HVHGRSRSSPSTDHYKQEASVELRRQAGDPGEP 
REELPSAVRAEEGQSTPRQADAQCREGSPGSQQ 
HPPSQKAPNPPTFSELSHCRGAPELPREGRGRAG 
TLPRDYRYSEESTPADLGPRAQSPGSPLHARGQD 
SWPVSSALLSKRPAPQRPPPPKREPRRYRATDGA 
PADAPVGVLGRPFPTPSPASLDVYVARLSLSHSPS 
VFSSAQPQDTPKATVCERGSQHVSGDASRPLPEA 
LLPPKQQHLRLQTATMETSRSPSPQFAPQKLTDK 
PPLLIQDEDSTRffiRVMDNNTTVKMVPIKIVHSES 
QPEKESRQSLACPAEPPALPHGLEKDQIKTLSTSE 
QFYSRFCLYTRQGAEPEAPHRAQPAEPQPLGTQV 
PPEKDRCTSPPGLSYMKAKEKTVEDLKSEELARE 
IVGKDKSLADILDPSVKIKTTMDLMEGIFPKDEH 
LLEEAQQREIKLLPKIPSPRSTEERKEEPSVPAAVS 
LATNSTYYSTSAPKAELLDCMKDLQEQQEHEEDS 
GSDLDHDLSVKKQELEESISRKLQVLREARESLLE 
DVQAlSnrVLGAEVEAIVKGVCKPSEFE)kFRMnG 
DLDKVVNLLLSLSGRLARVENALNNLDDGASPG 
DRQSLLEKQRVLIQQHEDAKELKENLDRRERIVF 
:^DmANYlSEESLADYEHFVKMKS^^ 
Km,GEiEQLKCLLDS^ . 


3141 


A 


97 


1894 


SPRGATMETPPLPPACTKQGHQKPLDSKDDNTE 

KHCPVIVNPWHMKKAFKVMNELRSQNLLCDVT 

IVAEDMEISAHRWLAACSPYFHAMFTGEMSESR 

AKRVRIKEVDGWTLRMLIDYVYTAEIQVTEENV 

QVLLPAAGLLQLQDVKKTCCEFLESQLHPVNCL 

GIRAFADMHACTDLLNKANTYAEQHFADWLSE 

EFLNLGIEQVCSLISSDKLTISSEEKVFEAVIAWV 

NHDKDVRQEFMARLMEHVRLPLLPREYLVQRV 

EEEALVKNSSACKNYLIEAMKYHLLPTEQRILMK 

SVRTRLRIPMNLPKLMVVVGGQAPKAIRSAECY 

DFKEQRWHQVAELPSRRCRAGMVYLAGLVFAV 

GGFNGSUlVRT\aDSYDPVKDQWTSVANMRDRR 

STLGAAVLNGLLYAVGGFDGSTGLSSVEAYNDCS 

NEWFHVAPMNTRRSSVGVGVVGGLLYAVGGYD 

GASRQYLSTVECYNATTNEWTYIAEMSTRRSGA 

GVGVL>nsnLLYAVGGHDGPL\^SVEVYDPTTN 

AWRQVADMNMCRRNAGVCAVNGLLYVVGGD 

DGSCmASVEYYNPTTDKWTVVSSCMSTGRSYA 

GVTVIDKPL 


3142 


A 


1211 


1311 


FSl^TTEKVAHAKEENLSMHQMLDQTLLELNN 
M 


3143 


A 


1809 


1041 


SEELDREKKLKEDSPRKTPNKESGVPSLPVSLTSI 

KEEPKEAKHPDSQSMEESKLKNDDRKTPVNWK 

DSRGTRVAVSSPMSQHQSYIQYLHAYPYPQMYD 

PSHPAYRAVSPVLMHSYPGAYLSPGFHYPVYGK 

MSGREETEKVNTSPSVNTKTTTESKALDLLQQH 

ANQYRSKSPAPVEKATAEREREAERERDRHSPFG 

QRHLHTHHHIHVGMGYPLIPGQYDPFQGLTSAA 
LVASQQVAAQASASGMFPGQRR 


3144 


A 


78 


604 


SVSGIVLDLLPYLHFLSNMNLDGSAQDPEKREYS 

SVCVGREDDKKSERMTAVVHDREVVIFYHKGE 

YHAMDIRCYHSGGPLHLGDIEDFDGRFCrVCPW 

HKYKITLATGEGLYQSINPKDPSAKPKWCSKGIK 

QRfflTVTVDNGNIYVTLSNEPFKCDSDFYATGDF 

KVKSSS 


3145 


A 


2 


333 


RNSLLLPPLHLDNSTPAKMSCQQNQQQCQPPPK 
CPSPKCPPKSPVQCLPPASSGCAPSSGGCGPSSEG 
GCFLNHHRRHHRCRRQRPNSCDRGSGQQGGGS 
GCGHGSGGCC 


3146 


A 


3 


1151 


VCTALQEFGTRSTLLRCLDSGFRPGASRGLVGSW 

AAMESTLGAGIVIAEALQNQLAWLENVWLWITF 

LGDPKILFLFYFPAAYYASRRVGIAVLWISLITEW 

LNLIFKWFLFGDRPFWWVHESGYYSQAPAQVHQ 

FPSSCETGPGSPSGHCMITGAALWPIMTALSSQV 

ATRARSRAWRVMPSLAYCTFLLAVGLSRIFILAH 

FPHQ\a,AGLITGAVLGWLMTPRVPMERELSFYG 

LTALALMLGTSLryWTLFTLGLDLSWSISLAFKW 

CERPEWIHVDSRPFASLSRDSGAALGLGIALHSPC 

YAQVRRAQLGNGQKIACLVLAMGLLGPLDWLG 

HPPQISLFYIFNFLKYTLWPCLVLALVPWAVHMF 

SAQEAPPIHSS 


3147 


A 


1437 


594 


RSFSLSFSLLSPSEMMALGAAGATRVFVAMVAA 

ALGGHPLLGVSATLNSVLNSNAIKNLPPPLGGAA 

GHPGSAVSAAPGILYPGGNKYQTIDNYQPYPCAE 

DEECGTDEYCASPTRGGDAGVQICLACRKRRKR 

CMRHAMCCPGNYCKNGICVSSDQNHFRGEIEETI 

TESFGNDHSTLDGYSRRTTLSSKMYHTKGQEGS 

VCLRSSDCASGLCCARHFWSKICKPVLKEGQVC 

TKHRRKGSHGLEIFQRCYCGEGLSCRIQKDHHQ 

ASNSSRLHTCQRH 


3148 


A 


1 


1562 


MSTI^YDIRAHKAQLLRFFASSDSNKALEQRRTLH 

TPKLEHLDRVLYEWFLGKRSEGVPVSGPMLIEK 

AKDFYEQMQLTEPCVFSGGWLWRFKARHGIKK 

LDASSEKQSADHQAAEQFCAFFRSLAAEHGLSA 

EQVYNADETGLFWRCLPNPTPEGGAVPGPKQGK 

DRLTVLMCANATGSHRLKPLAIGKCSGPRAFKGI 

QHLPVAYKAQGNAWVDKEIFSDWFHfflFVPSVR 

EHFRTIGLPEDSKAVLLLDSSRAHPQEAELVSSN 

VFTXFLPASVASLVQPMEQGnaODFMRNFINPPVP 

LQGPHARYNMNDAIFSVACAWNAVPSHVFRRA 

WRKLWPSVAFAEGSSSEEELEAECFPVKPHNKSF 

AHILELVKEGSSCPGQLRQRQAASWGVAGREAE 

GGRPPAATSPAEVVWSSEKTPKADQDGRGDPGE 

GEEVAWEQAAVAFDAVLRFAERQPCFSAQEVG 

QLRALRAVFRSQQQVRRRRGALGAWKVEALQ 

EGPGGCGATAQSPLPCSSTAGDN 


3149 


A 


132 


4125 


VAVMISTAPLYSGVHNWTSSDRIRMCGINEERRA 

PLSDEESTTGDCQHFGSQEFCVSSSFSKVELTAV 

GSGSNARGADPDGSATEKLGHKSEDKPDDPQPK 

MDYAGNVAEAEGLLVPLSSPGDGLKLPASDSAE 

ASNSRADCSWTPLNTQMSKQVDCSPAGVKALDS 

RQGVGEKNTFILAILGTGWVEGTLPLVTTNFSP 

LPAPICPPAPSSASVPHSVPDAFQAPVPPSAPTLVL 

APVPTPVLAPMPASTPPAAPAPPSVPMPTPTPSSG 

PPSTPTLIPAFAPTPVPAPTPAPIFTPAPTPMPAATP 

AAIPTSAPIPASFSLSRVCFPAAQAPAMQKVPLSF 

QPGTVLTPSQPLVYIPPPSCGQPLSVATLPTTLGV 

SSTLTLPVLPSYLQDRCLPGVLASPELRSYPYAFS 

VARPLTSDSKLVSLEVNRLPCTSPSGSTTTQPAPD 

GVPGPLADTSLVTASAKVLPTPQPLLPAPSGSSAP 

PHPAKMPSGTEQQTEGTSVTFSPLKSPPQLEREM 

ASPPECSEMPLDLSSKSNRQKLPLPNQRKTPPMP 

VLTPVHTSSKALLSTVLSRSQRTTQAAGGNVTSC 

LGSTSSPFVIFPEIVRNGDPSTWVKNSTALISTIPG 

TYVGVANPVPASLLLNKDPNLGLNRDPRHLPKQ 

EPISIIDQGEPKGTGATCGKKGSQAGAEGQPSTV 

KRYTPARIAPGLPGCQTKELSLWKPTGPANIYPR 

CSVNGKPTSTQVLPVGWSPYHQASLLSIGISSAG 

QLTPSQGAPIRPTSWSEFSGVPSLSSSEAVHGLP 

EGQPRPGGSFVPEQDPVTKNKTCRIAAKPYEEQV 

NPVLLTLSPQTGTLALSVQPSGGDIRMNQGPEES 

ESHLCSDSTPBCMEGPQGACGLKLAGDTKPKNQV 

LATYMSHELVLATPQNLPKMPELPLLPHDSHPKE 

LILDVVPSSJOIGSSTERPQLGSQVDLGRVKMEKV 

DGDWFNLATCFRADGLPVAPQRGQAEVRAKA 

GQARVKQESVGVFACKNKWQPDDVTESLPPKK 

MKCGKEKDSEEQQLQPQAKAWRSSHRPKCRK 

LPSDPQESTKKSPRGASDSGKEHNGVRGKHKHR 

Ba>TKPESQSPGKRAbSHEEGSLEKJ^ 

VVLSTRTRSQSDLKARKQKTSSSQSLEHRLRNRN 

LLLPNKVQGISDSPNGFLPNNLEEPACLENSEKPS 

GKRKCKTKHMATVSEEAKGKGRWSQQKTRSPK 

SPTPVKPTEPCTPSKSRSASSEEASESPTARQIPPE 

ARRLIVNKNAGETLLQRAARLGYKDWLYCLQK 

DSEDVNHRDNAGYTALHEACSRGWTDILNILLE 

HGA 


3150 


A 


3 


2795 


SLRMHNLSILVRQIKFYYQETLQQLIMMSLPNVLI 
IGKNPFSEQGTEEVKKLLLLLLGCAVQCQKKEEF 
ffiRIQGLDFDTKAAVAAfflQEVTHNQENVFDLQ 

WMEVTDMSQEDIEPLLKNMALHLKRLIDERDEH 

SETIIELSEERDGLHFLPHASSSAQSPCGSPGMKR 

TESRQHLSVELADAKAKIRRLRQELEEKTEQLLD 

CKQELEQMEffiLKRLQQENMhn.LSDARSARMYR 

DELDALREKAVRVDKLESEVSRYKERLHDIEFY 
KARVEELKEDNQVLLEIXTMLEDQLEGTRARSD 

KLHELEKENLQLKAKLHDMEMERDMDRKKIEE 

LI^ENMTLEMAQKQSIVIDESLHLGWELEQISRTS 

ELSEAPQKSLGHEVNELTSSRLLKLEMENQSLTK 

TVEELRTTVDSVEGNASKILKMEKENQRLSKKV 

EILENEIVQEKQSLQNCQNLSKDLMKEKAQLEKT 

IETLRENSERQIKILEQENEHLNQTYSSLRQRSQIS 

AEARVKDffiKENKILHESKETSSKLSKIEFEKRQI 

KKELEHYKEKGERAEELENELHHLEKENELLQK 

KITNLKITCEKIEALEQENSELERENRKLKKTLDS 

FKNLTFQLESLEKENSQLDEENLELRRNVESLKC 

ASMKMAQLQLENKELESEKEQLKKGLELLKASF 
KKTERLEVSYQGLDIENQRLQKTLENSNKKIQQL 

ESELQDLEMENQTLQKNLEELKISSKRLEQLEKE 

^IKSLEQETSQLEKDKKQLEKENKRLRQQAEIKD 

TTLEENIWKIGNLEKENKTLSKEIGIYKESCVRLE 

ELEKENKELVKRATIDIKTLVTLREDLVSEBCLKT 

QQMNNDLEKLTHELEKJGLNKERLLHDEQSTO 

SRYKLLESKLESTLKKSLEIKEEKIAALEARLEES 

TNYNQQLRQELKTVKKK 


3151 


A 


2 


2515 


GFWLHLH.LGASLPAALGWMDPGTSRGPDVGV 

GESQAEEPRSFEVTWIEGLSSHNELLASCGKKFC 

SRGSRCVLSRKTGEPECQCLEACRPSYVPVCGSD 

GRFYENHCKLHRAACLLGKRITVmSBCDCFLKGD 

TCTMAGYARLKNVLLALQTRLQPLQEGDSRQDP 

ASQKRLLVESLFRDLDADGNGHLSSSELAQHVL 

KKQDLDEDLLGCSPGDLLRFDDYNSDSSLTLREF 

YMAFQWQLSLAPEDRVSVTTVTVGLSTVLTCA 

VHGDLRPPnWKRNGLTLNFLDLEDINDFGEDDS 

LYITKVTTIHMGNYTCHASGHEQLFQTHVLQVN 

VPPVIRVYPESQAQEPGVAASLRCHAEGIPMPRIT 

WLKNGVDVSTQMSKQLSLLANGSELHISSVRYE 

DTGAYTCIAKNEVGVDEDISSLFIEDSARKTLANI 

LWREEGLSVGNMFYVFSDDGIIVIHPVDCEIQRH 

LKPTEKIFMSYEEICPQREKNATQPCQWVSAVNV 

RNRYIYVAQPALSRVLVVDIQAHKVLQSIGVDPL 

PAKLSYDKSHDQVWVLSWGDVHKSRPSLQVITE 

ASTGQSQHLIRTPFAGVDDFFIPPTNLIINHIRFGFI 

FNKSDPAVrocVDLETMMPLKTIGLHHHGCVPQA 

MAHTHLGGii?TFIQGRQDSPASAARQLLVDSVTD 

SVLGPNGDVTGTPHTSPDGRFIVSAAADSPWLHV 

QEITVRGEIQTLYDLQINSGISDLAFQRSFTESNQ 

YNIYAALHTEPDLLFLELSTGKVGMLKNLKEPPA 

GPAQPWGGTHRIMRDSGLFGQYLLTPARESLFLI 

NGRQNTLRCEVSGIKGGTTVVWVGEV 


3152 


A 


1 


2645 


GAGWQVSLTGRWSPGREAGAGEVRQDPGSTAA 

SPSSCDADLSARMARGERRRRAVPAEGVRTAER 

AARGGPGRRDGRGGGPRSTAGGVALAVVVLSL 

ALGMSGRWVLAWYRARRAVTLHSAPAVLPADS 

SSPAVAPDLFWGTYRPHVYFGMKTRSPKPLLTG 

LMWAQQGTTPGTPKLRHTCEQGDGVGPYGWEF 

HDGLSFGRQHIQDGALRLTTEFVKRPGGQHGGD 

WSWRVTVEPQDSGTSALPLVSLFFYWTDGKEV 

LLPEVGAKGQLKFISGHTSELGDFRFTLLPPTSPG 

DTAPKYGSY^^VFWTSNPGLPLLTEMVKSRLNSW 

FQHRPPGASPERYLGLPGSLKWEDRGPSGQGQG 

QFLIQQVTLKIPISIEFVFESGSAQAGGNQALPRLA 

GSLLTQALESHAEGFRERFEKTFQLKEKGLSSGE 

QVLGQAALSGLLGGIGYFYGQGLVLPDIGVEGSE 

QKVDPALFPPVPLFTAVPSRSFFPRGFLWDEGFH 

QLWQRWDPSLTREALGHWLGLLNADGWIGRE 

QILGDEARARVPPEFLVQRAVHANPPTLLLPVAH 

\>iT pvrtnpnDT aft RKALPRLHAWFSWLHOSOA 

GPLPLSYRWRGRDPALPTLLNPKTLPSGLDDYPR 

ASHPSVTERHLDLRCWVALGARVLTRLAEHLGE 

AEVAAELGPLAASLEAAESLDELHWAPELGVFA 

DFGNHTKAVQLKPRPPQGLVRWGRPQPQLQYV 

DALGYVSLFPLLLRLLDPTSSRLGPLLDILADSRH 



LWSPFGLRSLAASSSFYGQRNSEHDPPYWRGAV 
WLNVNYLALGALHHyOHLEGPHQARAAKLHGE 
LRANWGNVWRQYQATGFLAVEQYSDRDGRGM 
GCRPFHGWTSLVLLAMAEDY 


3153 


A 


1 


4312 


MVIKTDELPAAAPADSAKEHGSQAGGKGRPGAA 
AVLLADLERDARQGECALPGAAMAGLAPLKPE 
ASRSSSPGPTGCIRARVAAEAGTR>TPGNAGAELE 
SWLPCCHGHPETPEPRGGQLPTAPELPSVMLLNG 
DCPESLKKEAAAAEPPRENGLDEAGPGDETTGQ 
EVIVIQDTGFSVKILAPGIEPFSLQVSPQEMVQEIH 
QVLMDREDTCHRTCFSLHLDGNVLDHFSELRSV 
EGLQEGSVLRWEEPYTVREARIHVRHVRDLLKS 
LDPSDAFNGVDCNSLSFLSVFTDGDLGDSGKRK 
KGLEMDPIDCTPPEYILPGSRERPLCPLQPQNRD 
WKPLQCLKVLIMSGWNPPPGNRKMHGDLMYLF 
VITAEDRQVSITASTRGFYLNQSTAYHFNPKPASP 
RFLSHSLVELLNQISPTFKKNFAVLQKKRVQRHP 
FERIATPFQVYSWTAPQAEHAMDCVRAEDAYTS 
RLGYEEfflPGQTRDWNEELQTTRELPRKNLPERL 
LRERAIFKVHSDFTAAAmGAlvlA.VIDGNV]y[AIN 
PSEETKMQMFIWNNIFFSLGFDVRDHYKDFGGD 
VAAYVAPTNDLNGVRTYNAVDVEGLYTLGTVV 
VDYRGYRVTAQSIIPGILERDQEQSVIYGSIDFGK 
TVVSHPRYLELLERTSRPLKILRHQVLNDRDEEV 
ELCSSVECKGnGNDGRHYILDLLRTFPPDLNFLP 
VPGEELPEECARAGFPRAHRHKLCCLRQELVDA 
' F^iTBHRYLLFMKLAALQLM 
GGPSSLESKSEDPPGQEAGSEEEGSSASGLAKVK 
ELAETIAADDGTDPRSREVIRNACKAVGSISSTAF 
DIRFNPDIFSPGVRFPESCQDEVRDQKQLLKDAA 
AFLLSCQIPGLVKDCMEHAVLPVDGATLAEVMR 
QRGINMRYLGKVLELVLRSPARHQLDHVFKIGIG 
ELITRSAKHBFKTYLQGVELSGLSAAISHFLNCFLS 
SYPNPVAHLPADELVSKKRNKRRKNRPPGAADN 
TAWAVMTPQELWKNICQEAKNYFDFDLECETV 
DQAVETYGLQBaTLLREISLKTGIQVLLKEYSFDS 
RHKPAFTEEDVLhHFPWKHVNPKASDAFHFFQS 
GQAKVQQGFLKEGCELINEALNliWfVYGAMH 
VETCACLRLLARLHYIMGDYAEALSNQQKAVL 
MSERVMGTEHPNTIQEYMHLALYCFASSQLSTA 
LSLLYRARYLMLLVFGEDHPEMALLDNNIGLVL 
HGVMEYDLSLRFLENALAVSTKYHGPKALKVAL 
SHHLVARVYESKAEFRSALQHEKEGYTIYKTQL 
GEDHEKTKESSEYLKCLTQQAVALQRTMNEIYR 
NGSSANIPPIJOTAPSMASVLEQLNVINGILFIPLS 
QKDLENLKAEVARJRHQLQEASRNRDRAEEPMA 
TEPAPAGAPGDLGSQPPAAKDPSPSVQG 


3154 


A 


416 


4082 


KFKLIKIMLLTLIILLPVVSKFSFVSLSAPQHWSCP 

EGTLAGNGNSTCVGPAPFLIFSHGNSIFRIDTEGT 

NYEQLWDAGVSVIMDFHYNEKRIYWVDLERQ 

LLQRVFLNGSRQERVO^Kl^SGMAINWINEEV 

IWSNQQEGIITVTDMKGNNSHILLSALKYPANVA 

VDPVERFIFWSSEVAGSLYRADLDGVGVKALLE 

TSEKITAVSLDVLDKRLFWIQYNREGSNSLICSCD 

YDGGSVfflSKHPTQHNLFAMSLFGDRIFYSTWK 

MKTIWIANKHTGKDMVRINLHSSFVPLGELKVV 

HPLAQPKAEDDTWEPEQKLCKLRKGNCSSTVCG 

QDLQSHLCMCAEGYALSRDRKYCEGNDWKYCE 

DVNECAFWNHGCTLGCKNTPGSYYCTCPVGFVL 

LPDGKRCHQLVSCPRNVSECSHDCVLTSEGPLCF 

CPEGSVLERDGKTCSGCSSPDNGGCSQLCVPLSP 

VSWECDCFPGYDLQLDEKSCAASGPQPFLLFANS 

QDIRHMHFDGTDYGTLLSQQMGMVYALDHDPV 

ENmFAHTALKWIERANMDGSQRERLIEEGVD 

VPEGLAVDWIGRRFYWTDRGKSLIGRSDLNGKR 

SKnTIENISQPRGIAVHPMAKRLFWTDTGINPRIE 

SSSLQGLGRLVIASSDLIWPSGITIDFLTDKLYWC 

DAKQSVIEMANLDGSKRRRLTQNDVGHPFAVA 

VFEDYVWFSDWAMPSVIRVNKRTGKDRVRLQG 

SMLKPSSLVVVHPLAKPGADPCLYQNGGCEHIC 

KKRLGTAWCSCREGFMKASDGKTCLALDGHQL 

LAGGEVDLKNQVTPLDILSKTRVSEDNITESQHM 

LVAEIMVSDQDDCAPVGCSMYARCISEGEDATC 

QCLKGFAGDGKLCSDIDECEMGVPVCPPASSKCI 

NTEGGYVCRCSEGYQGDGIHCLDIDECQLGVHS 

CGENASCTNTEGGYTCMCAGRLSEPGLICPDSTP 

PPHLREDDHHYSVRNSDSECPLSHDGYCLHDGV 

CMYIEALDKYACNCWGYIGERCQYRDLKWWE 

LRHAGHGQQQKVIWAVCVWLVMLLLLSLWG 

AHYYRTQKLLSKNPKNPYEESSRDVRSRRPADT 

EDGMSSCPQPWFWIKEHQDLKNGGQPVAGED 

GQAADGSMQPTSWRQEPQLCGMGTEQGCWIPV 

SSDKGSCPQVMERSFHMPSYGTQTLEGGVEKPH 

SLLSANPLWQQRALDPPHQMELTQ 


3155 


A 


533 


212 


GTSGWYWERLAERRGRLWSREEAMATMENKVI 
CALVLVSMLALGTLAEAQTETCTVAPRERQNCG 
FPGVTPSQCANKGCCFDDTVRGVPWCFYPNTE) 
VPPEEECEF 


3156 


A 


2 


1585 


PRVRAADVAAGAQAWSAGMAKSNGENGPRAP 

AAGESLSGTRESLAQGPDAATTDELSSLGSDSEA 

NGFAERRIDKFGFIVGSQGAEGALEEVPLEVLRQ 

RESKWLDMLISWWDKWMAKKHKKIRLRCQKGI 

PPSLRGRAWQYLSGGKVKLQQNPGKFDELDMSP 

GDPKWLDVIERDLHRQFPFHEMFVSRGGHGQQD 

LFRVLKAYTLYRPEEGYCQAQAPIAAVLLMHMP 

AEQAFWCLVQICEKYLPGYYSEKLEAIQLDGEIL 

FSLLQKVSPVAHKHLSRQKIDPLLYMTEWFMCA 

FSRTLPWSSVLRVWDMFFCEGVKIIFRVGLVLLK 

HALGSPEKVKACQGQYETIERLRSLSPKIMQEAF 

LVQEVVELPVTERQIEREHLLQLRRWQETRGELQ 

CRSPPRLHGAKAILDAEPGPRPALQPSPSIRLPLD 

APLPGSKAKPKPPKQAQKEQRKQMKGRGQLEKP 

PAPNQAMWAAAGDACPPQHVPPKDSAPKDSAP 

ODLAPQVSAHHRSQESLTSQESEDTYL 


3157 


A 


3 


601 


HRFRALDRMCKGYLSRMDLQQIGALAVNPLGDR 

IIESFFPDGSQRVDFPGFVRVLAHFRPVEDEDTET 

QDPKKPEPLNSRR>4KLHYAFQLYDLDRDGKISR 

HEMLQVLRLMVGVQVTEEQLENIADRTVQEAD 

EDGDGAVSFVEFTECSLEKMDVEHKMSIRILK 


3158 


A 


2 


409 


ISSCPHTAYEGSMSTLSNFTQTLEDWRRIFITYM 
DhWRQNTTAEQEALQAKVDAENFYYVILYLMV 
mGMFSFnVAILVSTVKSKRREHSNDPYHQYIVE 
DWQEKYKSQILNLEESKATIHENIGAAGFKMSP 


3159 


A 


3 


416 


PWGAAELDMGRRDAQLLAALLVLGLCALAGSE 

KPSPCQCSRLSPHNRTNCGFPGITSDQCFDNGCCF 
DSSVTGVPWCFHPLPKQESDQCVMEVSDRRNCG 
YPGISPEECASRKCCFSNFIFEVPWCFFPKSVEDC 
HY 


3160 


A 


179 


409 


KPKTBaLKMVYYPELFVWVSQEPFPNKDMEGRL 
PKGRLPWKEVNRKKNDETNAASLTPLGSSELRS 
PRISYLHFF 


3161 


A 


683 


1186 


LSSTGGLHAAACAAAMSLVIPEKFQHILRVLNTN 

roGRRKIAFAITAIKGVGRRYAHVVLRKADIDLT 

KRAGELTEDEVERVITIMQNPRQYKIPDWFLNRQ 

KDVKDGKYSQVLANGLDNKI.REDLERLKKIRA 

HRGLRHFWGLRVRGQHTKTTGRRGRTVGVSKK 

K 


3162 


A 


1 


1938 


GMPRSRGGRAAPGPPPPPPPPGQAPRWSRWRVP 

GRLLLLLLPALCCLPGAARAAAAAAGAGNRAA 

VAVAVARADEAEAPFAGQNWLKSYGYLLPYDS 

RASALHSAKALQSAVSTMQQFYGIPVTGVLDQT 

TffiWMKKPRCGVPDHPHLSRRRRNKRYALTGQK 

WRQKHITYSIHNYTPKVGELDTRKAIRQAFDVW 

QKVTPLTFEEVPYHEIKSDRKEADIMIFFASGFHG 

DSSPFDGEGGFLAHAYFPGPGIGGDTHFDSDEPW 

TLGNAiraDGNbLFtVAVHEUjHALGEEHSSDPS ' 

AIMAPFYQYNffilTO^^ 

PLEPTRPLPTLPVRRIHSPSERKHERQPRPPRPPLG 

DRPSTPGTKPNICDGNFNTVALFRGBMFVFBCDR 

WFWRLKNNRVQEGYPMQIEQFWKGLPARIDAA 

YERADGRFVFFKGDKYWVFKEVTVEPGYPHSLG 

ELGSCLPREGIDTALRWEPVGKTYFFKGERYWR 

YSEERRATDPGYPKPrrVWKGIPQAPQGAFISKE 

GYYTYFYKGRDYWKFDNQKLSVEPGYPRhnLRD 

WMGCNQKEVERRKERRLPQDDVDIMVTINDVP 

GSVNAVAVVIPCILSLCILVLVYTIFQFKNKTGPQ 

PVTYYKRPVQEWV 


3163 


A 


1235 


2223 


SRLSLQFYVSFRRTGLFTCKLrNnEIFFWm«NDSL 

RTNVFVRFQPETIACACIYLAARALQIPLPTRPHW 

FLLFGTTEEEIQEICIETLRLYTRKKPNYELLEKEV 

EKRKVALQEAKLKAKGLNPDGTPALSTLGGFSP 

ASKPSSPREVKAEEKSPISINVKTVKKEPEDRQQA 

SKSPYNGVRKDSKRSRNSRSASRSRSRTRSRSRS 

HTPRRHYNNRRSRSGTYSSRSRSRSRSHSESPRR 

HHNHGSPHLBCAKHTRDDLKSSKRHGHKRKKSRS 

RSQSKSRDHSDAAKKHRHERGHHRDRRERSRSF 

ERSHKSKHHGGSRSGHGRHRR 


3164 


A 


3 


3274 


DCRLQAAMPTNFTVVPVEAHADGGGDETAERT 

EAPGTPEGPEPERPSPGDGNPRENSPFLNNVEVE 

QESFFEGK>nviALFEEEMDSNPMVSSLLl^KLANY 

TNLSQGVVEHEEDEESRRREAKAPRMGTFIGVY 

LPCLQNILGVLLFLRLTWIVGVAGVLESFLIVAMC 

CTCTMLTAISMSAIATNGVVPAGGSYYMISRSLG 

PEFGGAVGLCFYLGTTFAGAMYILGTIEIFLTYISP 

GAAIFQAEAAGGEAAAMLHNMRVYGTCTLVLM 

ALVVFVGVKYVNKLALVFLACVVLSILAIYAGVI 

KSAFDPPDPVCLLGNRTLSRRSFDACVKAYGIH 

NNSATSALWGLFCNGSQPSAACDEYFIQNNVTEI 

QGIPGAASGVFLENLWSTYAHAGAFVEKKGVPS 

VPVAEESRASTLPYVLTDIAASFTLLVGIYFPSVT 

GIMAGSNRSGDLKDAQKSIPTGTILAIVTTSFIYLS 

CIVLFGACIEGWLRDKFGEALQGNLVIGMLAW 

PSPWVIVIGSFFSTCGAGLQTLTGAPRLLQAIARD 

GIVPFLQVFGHGKANGEPTWALLLTVLICETGILI 

ASLDSVAPILSMFFLMCYLFVNLACAVQTLLRTP 

NWRPRFKJ^YHWTLSFLGMSLCLALMFICSWYYA 

LSAMLIAGCIYKYIEYRGAEKEWGDGIRGLSLNA 

ARYALLRVEHGPPHTKNWRPQVLVMLNLDAEQ 

AMKHPRLLSFTSQLKAGKGLTIVGSVLEGTYLD 

KHMEAQRAEENIRSLMSTEKTKGFCQLWSSSLR 

DGMSHLIQSAGLGGLKHNTVLMAWPASWKQED 

NPFSWKNFVDTVRDTTAAHQALLVAKNVDSFPQ 

NQERFGGGHmVWWIVHDGGMLMLLPFLLRQH 

KVWRKCRMRIFTVAQVDDNSIQMKKDLQMFLY 

HLRISAEVEVVEMVENDISAFTYERTLMMEQRS 

QMLKQMQLSKNEQEREAQLIHDRNTASHTAAA 

ARTQAPPTPDKVQMTWTREKLIAEKYRSRDTSL 

SGFKDLFSMKPDQSNVRRMHTAVKLNGVVLNK 

SQDAQLVLLNMPGPPKNRQGDENYMEFLEVLTE 

GLNRVLLVRGGGREVITIYS 


3165 


A 


3 


2681 


GRGARGGSGAGALRGCRGYLQKLSGKGPSRGY 

RSRWFVFDARRCYLYYFKSPQDALPLGHLDIAD 

ACFSYQGPDEAAEPGTEPPAHFQVHSAGAVTVL 

KAPNRQLMTYWLQELQQKRWEYCNSLDMVKW 

DSRTSPTPGDFPKGLVARDNTDLIYPHPNASAEK 

AKNVLAVETVPGELVGEQAANQPAPGHPNSINF 

YSLKQWGNELKNSMSSFRPGRGHNDSRRTVFYT 

NEEWELLDPTPKDLEESIVQEEKKKLTPEGNKGV 

TGSGFPFDFGRNPYKGKRPLBCDIIGSYKNRHSSG 

DPSSEGTSGSGSVSIRKPASEMQLQVQSQQEELE 

QLKKDLSSQKELVRLLQQTVRSSQYDKYFTSSRL 

CEGVPKDTLELLHQKDDQILGLTSQLERFSLEKE 

SLQQEVRTLKSKVGELNEQLGMLMETIQAKDEV 

nKLSEGEGNGPPPTVAPSSPSVVPVARDQLELDR 

LKDNLQGYKTQNKFLNKEILELSALRRNPERRER 

DLMARNSSLEAKLCQffiSKYLILLQEMKTPVCSE 

DQGPTREVIAQLLEDALQVESQEQPEQAFVKPHL 

VSEYDIYGFRTVPEDDEEEKLVAKVRALDLKTL 

YLTENQEVSTGVKWE>rYFASTVNREMMCSPEL 

KNLIRAGIPHEimSKVWKWCVDRHTRKFKDNTE 

PGHFQTLLQKALEKQNPASKQIELDLLRTLPNNK 

HYSCPTSEGIQKLRNVLLAFSWKNPDIGYCQGLN 

RLVAVALLYLEQEDAFWCLVTIVEVFMPRDYYT 

xrTT T /^orwmnDVTTPnT lufWK'T PRT NfiWFOYKV 
KTLLGSC^VJL'v^KVrl\iJJ-iMi3iiivi^ ^ xv v 

DYTLITFNWFLWFVDSWSDILFKIWDSFLYEGP 
KVIFRFALALFKYKEEEELKLQDSMSIFKYLRYFT 
RTDLDARSGTDAPTTWRKSGWS 


3166 


A 


10 


4070 


FPGPTISSNSQLYRASALFETIRHEAQLSTDYKLS 
LFDLQTSSYOALQRVLVSLGHHDEALAVAERGR 
TRAFADLLVERQTGQQDSDPYSPVTIDQILEMVN 
GQRGLVLYYSLAAGYLYSWLLAPGAGIVKFHEH 
YLGENTVENSSDFQASSSVTLPTATGSALEQHIAS 
VREALGVESHYSRACASSETESEAGDIMDQQFEE 
MNNKLNSVTDPTGFLRMVRWWL 
LFSNTVSPTQDGTSSLPRRQSSFAKPPLRALYDLL 
lAPMEGGLMHSSGPVGRHRQLILVLEGELYLIPF 
ALLKGSSSNEYLYERFGLLAVPSIRSLSVQSKSHL 
RKNPPTYSSSTSMAAVIGNPKLPSAVMDRWLWG 
PMPSAEEEAYMVSELLGCQPLVGSVATKERVMS 
ALTQAECVHFATfflSWKLSALVLTPSMDGNPASS 
KSSFGHPYTIPESLRVQDDASDGESISDCPPLQEL 
LLTAADVLDLQLPVKLWLGSSQESNSKVAADG 
VIALTRAFLAAGAQCVLVSLWPVPVAAFKMFIH 
AFYSSLLNGLKASAALGEAMKWQSSKAFSHPS 
NWAGFMLIGSDVKLNSPSSLIGQALTEILQHPER 
ARDALRVLLHLVEKSLQRIQNGQRNAMYTSQQS 
VENKVGGBPGWQALLTAVGFRLDPPTSGLPAAV 
FFPTSDPGDRLQQCSSTLQSLLGLPNPALQALCK 
LITASETGEQLISRAVKNMVGMLHQVLVQLQAG 
EKEQDLASAPIQVSISVQLWRLPGCHEFLAALGF 
VLCEVGQEEVILKTGKQA>JRRTVHFALQSLLSLF 
DSTELPKRLSLDSSSSLESLASAQSVSNALPLGYQ 
QPPFSPTGADSIASDAISVYSLSSIASSMSFVSKPE 
GGSEGGGPGGRQDHDRSKNAYLQRSTLPRSQLP 
PQTRPAGNKDEEEYEGFSnSNEPLATYQENRNTC 
^ FSPDIKQPQPGT AGGMRVSVSSKGSISTPNSPVK 
iSdTLIPSPNSPFQkVGKLASSliTGESDQSSTETDST 
VKSQEESNPKLDPQELAQKILEETQSHLIAVERLQ 
RSGGQVSKSNNPEDGVQAPSSTAVFRASETSAFS 
RPVLSHQKSQPSPVTVKPKPPARSSSLPKVSSGYS 
SPTTSEMSIKDSPSOHSGRPSPGCDSOTSOLDOPL 
FKLKYPSSPYSAHISKSPKNMSPSSGHQSPAGSAP 
SPALSYSSAGSARSSPADAPDIDKLKMAAIDEKV 
QAVHNLKMFWQSTPQHSTGPMKIFRGAPGTMTS 
KRDVLSLLNLSPRPNKKEEGVDKLELKELSLQQH 
DGAPPKAPPNGHWRTETTSLGSLPLPAGPPATAP 
ARPLRLPSGNGYKFLSPGRFFPSSKC 


3167 


A 


1 


762 


AARRRQKGKEE^^MMMDLFETGSYFFYLDGENV 

TLQPLEVAEGSPLYPGSDGTLSPCQDQMPPEAGS 

DSSGEEHVLAPPGLQPPHCPGQCLIWACKTCKRK 

SAPTDRRKAATLRERRRLKKINEAFEALKRRTVA 

NPNQRLPKVEILRSAISYIERLQDLLHRLDQQEK 

MQELGVDPFSYRPKQENLEGADFLRTCSSQWPS 

VSDHSRGLVITAKEGGASIDSSASSSLRCLSSIVDS 

ISSEERKLPCVEEWBK 


3168 


A 


701 


246 


TSRRVITVIKFNPFVTSDRSKNRKRHFNAPSH^ 

KIMSSPLSKELRQKYNVRSMPIRKDDEVQVVRG 

HYKGQQIGKVVQVYRKKYVIYIERVQREKANGT 

TVHVGIHPSKVVITRLKLDKDRKKILERKAKSRQ 

VGKEKGKYKEELIEKMQE 


3169 


A 


156 


3168 


GPGGAISLSVEAKAGADLLVKGKQARMDIYDTQ 
TLGWVFGGFMWSAIGIFLVSTFSMKETSYEEA 
LANQRKEMAKTHHQKVEKKKKEKTVEKKGKT 
KKKEEKPNGKIPDHDPAPNVTVLLREPVRAPAV 



AVAPTPVQPPIIVAPVATVPAMPQEKLASSPKDK 

KKKEKKVAKVEPAVSSWNSIQVLTSKAAILETA 

PKEGRNTDVAQSPEAPKQEAPAKKKSGSKKKGP 

PDADGPLYLPYKTLVSTVGSMVFNEGEAQRLM 

LSEKAGnQDTWHKATQKGDPVAILKRQLEEKEK 

LLATEQEDAAVAKSKLRELNKEMAAEKAKAAA 

GEAKVKKQLVAREQEITAVQARMQASYREHVK 

EVQQLQGKIRTLQEQLENGPNTQLARLQQENSIL 

RDALNQATSQVESKQNAELAKLRQELSKVSKEL 

VEKSEAVRQDEQQRKALEAKAAAFEKQVLQLQ 

ASHRESEEALQKRLDEVSRELCHTQSSHASLRAD 

AEKAQEQQQQMAELHSKLQSSEAEVRSKCEELS 

GLHGQLQEARAENSQLTERIRSIEALLEAGQARD 

AQDVQASQAEADQQQTRLKELESQVSGLEKEAI 

ELREAVEQQKVKNNDLREKNWKAMEALATAEQ 

ACKEKLHSLTQAKEESEKQLCLIEAQTMEALLAL 

LPELSVLAQQNYTEWLQDLKEKGPTLLKHPPAP 

AEPSSDLASKLREAEETQSTLQAECDQYRSILAET 

EGMLRDLQKSVEEEEQVWRAKVGAAEEELQKS 

RVTVKHLEEIVEKLKGELESSDQVREHTSHLEAE 

LEKHMAAASAECQNYAKEVAGLRQLLLESQSQL 

DAAKSEAQKQSDELALVRQQLSEMKSHVEDGDI 

AGAPASSPEAPPAEQDPVQLKTQLEWTEAILEDE 

QTQRQKLTAEFEEAQTSACRLQEELEKLRTAGPL 

ESSETEEASQLKERLEKEKKLTSDLGRAATRLQE 

LLKTTQEQLAREKDTVKXLQEQLEKAEDGSSSK 

EGTSV 


3170 


A 


6730 


4027 


THASEKYSYGHLPTHSITAHPMVriKlSUKl^Ki.iv^ 

PYIHNYSWLLFAALALYSAHLASAEDVDGEKLD 

PQTRSSATTLRSQCMQLVGDCLMKAHQGKGLK 

ALALLGVLPDGDSSLEDHALPVTVPTGASEEQLE 

KKAVQGAELSEAGNGKRAVHEEIRPVDFKQRNK 

ADKGVSLSKDPSCQTQISDSPADASPPTGLPDAE 

DSEVSSQKPIEEKAVTPSPEQVFAECSQKRILGLL 

AAMLPPLKSGPTVPLIDLEHVLPLMFQVVISNAG 

HLNETYHLTLGLLGQLIIRLLPAEVDAAVIKVLSA 

KHNLFAAGDSSIVPDGWKTTHLLFSLGAVCLDS 

RVGLDWACSMAEBLRSLNSAPLWRDVIATFTDH 

CIKQLPFQLKHTNIFTLLVLVGFPQVLCVGTRCV 

YMDNANEPHNVIILKHFTEKNRAVIVDVKTRK^ 

KTVKDYQLVQKGGGQECGDSRAQLSQYSQHFA 

FIASHLLQSSMDSHCPEAVEATWVLSLALKGLY 

KTLKAHGFEEDRATFLQTDLLKLLVKKCSKGTGF 

SKTWLLRDLEILSIMLYSSKKEINALAEHGDLEL 

DERGDREEEVERPVSSPGDPEQKKLDPLEGLDEP 

TRICFLMAHDALNAPLHE.RAIYELQMKKTDYFF 

LEVQKRFDGDELTTDERIRSLAQRWQPSKSLRLE 

EQSAKAVDTDMIILPCLSRPARCDQATAESNPVT 

QKLISSTESELQQSYAKQRRSKSAALLHKELNCK 

ovT> A\7T>nvT TOVMPATAVT.YARHVLASLLAEWP 

SHVPVSEDELELSGPAHMTYILDMFMQLEEKHE 

WEKWMQTELVLTHQVLPLPHRLPPVSASWSEA 

TCVAVQLPDRCECSKGRVTVSSPKDWASEELRG 

PERDFOLNQKALSPSSQFPSAEELRHIR 


3171 


A 


557 


89 


GTRAGPVKDREAFORLNFLYQAAHCVLAQDPliN
QALARFYCYTERTIAKRLVLRRDPSVKRTLCRGC 
SSLLVPGLTCTQRQRRCRGQRWTVQTCLTCQRS 
QRPLNDPGHLLWGDRPEAQLGSQADSKPLQPLP 
NTAHSISDRLPEEKMQTQGSSNQ 


3172 


A 


2 


496 


FRRAGAGRGRRRGEVTSPLSPEPLAFQSLATSRR 

PEPQTTQTVRSSALPAPPASPMSQYAPSPDFKRA 

LDSSPEANTEDDKTEEDVPMPKNYLWLTIVSCFC 

PAYPINT/ALVFSIMSLNSYNDGDYEGARRLGRN 

AKWVAIASIIIGLLIIGISCAVHFTRNA 


3173 


A 


2 


4048 


FRSGGCRRRAWTSRWPQRRRSPESCEAPLSAPL 

WGPQRGLPGREPLRSRSASAIALRTIGHILALLLR 

LLHLGLGSGGCREDWPSGRGKKEEBCMKKHRRA 

LALVSCLFLCSLVWLPSWRVCCKESSSASASSYY 

SQDDNCALENEDVQFQKKDEREGPINAESLGKS 

GSNLPISPKEHKLKDDSIVDVQNTESKKLSPPVVE 

TLPTVDLHEESSNAVVDSETVENISSSSTSEITPIS 

KLDEEEKSGTIPIAKPSETEQSETDCDVGEALDAS 

APIEQPSFVSPPDSLVGQHIENVSSSHGKGKITKSE 

FESKVSASEQGGGDPKSALNASDNLKNESSDYT 

KPGDroPTSVASPKDPEDIPTFDEWKKKVMEVEK 

EKSQSMHASSNGGSHATKKVQKNRNNYASVEC 

GAKILAANPEAKSTSAILIENMDLYMLNPCSTKI 

WFVIELCEPIQVKQLDIANYELFSSTPKDFLVSISD 

RYPTNKWIKLGTFHGRDERNVQSFPLDEQMYAK 

YVKMFIKYIKVELLSHFGSEHFCPLSLIRVFGTSM 

VEEYEEIADSQYHSERQELFDEDYDYPLDYNTGE 

DKSSKNLLGSATNAILNMVNIAAI^ 

EGT^SisiENAfATA'APKMPESTPVSTPW^ 

TEVHTHDMEPSTPDTPKESPIVQLVQEEEEEASPS 

TVTLLGSGEQEDESSPWFESETQEFCSELTTICCIS 

SFSEYIYKWCSVRVALYRQRSRTALSKGKDYLV 

LAQPPLLLPAESVDVSVLQPLSGELENTNIEREAE 

TVVLGDLSSSMHQDDLVNHTVDAVELEPSHSQT 

LSQSLLLDITPEINPLPKIEVSESVEYEAGHIPSPVI 

PQESSVEEDNETEQKSESFSSIEKPSITYETNKVNE 

LMDNIIKEDVNSMQIFTKLSETIWPINTATVPDN 

EDGEAKMNIADTAKQTLISVVDSSSLPEVKEEEQ 

SPEDALLRGLQRTATDFYAELQNSTDLGYANGN 

LVHGSNQKESVFMRLNNRIKALEVNMSLSGRYL 

EELSQRYRKQMEEMQKAFNKITVKLQNTSRIAE 

EQDQRQTEAIQLLQAQLTNMTQLVSNLSATVAE 

LKREVSDRQSYLVISLVLCVVLGLMLCMQRCRN 

TSQFDGD YISKLPKSNQYPSPKRCFSSYDDMNLK 

RRTSFPLMRSKSLQLTGKEVDPNDLYIVEPLKFSP 

EKKKKRCKYKffiKIETIKPEEPIJElPIANGDIKGRK 

PFTNQRDFSNMGEVYHSSYKGPPSEGSSETSSQS 

EESYFCGISACTSLCNGQSQKTKTEKRALKRRRS 

KVQDQGKLIKTLIQTKSGSLPSLHDIIKGNKEITV 

GTFGVTAVSGHI 


3174 


A 


485 


4668 


RKCSKEKASKTPSQKIPTTPCCVLQAGPEPRSLAE 
RMGADGETWLKNMLIGVNLILLGSMIKPSECQL 

EVTTERVQRQSVEEEGGUNYNTSSKEQPWFNH 
VYNINVPLDNLCSSGLEASAEQEVSAEDETLAEY 
MGQTSDHESQVTFTHRINFPKKACPCASSAQVLQ 
ELLSRIEMLEREVSVLRDQCNANCCQESAATGQL


3175 



3176 



DYIPHCSGHGNFSFESCGCICNEGWFGKNCSEPY 
CPLGCSSRGVCVDGQCICDSEYSGDDCSELRCPT 
DCSSRGLCVDGECVCEEPYTGEDCRELRGPGDCS 
GKGRCANGTCLCEEGYVGEDCGQRQCLNACSG 
RGQCEEGLCVCEEGYQGPDCSAVAPPEDLRVAG 
ISDRSIELEWDGPMAVTEYVISYQPTALGGLQLQ 
QRVPGDWSGVTrrELEPGLTYNISVYAVISNILSL 
PITAKVATHLSTPQGLQFKTITETTVEVQWEPFSF 
SFDGWEISFIPKNNEGGVIAQVPSDVTSFNQTGLK 
PGEEYTVNWALKEQARSPPTSASVSTVIDGPTQI 
LVRDVSDTVAFVEWIPPRAKVDFELLKYGLVGGE 
GGRTTFRLQPPLSQYSVQALRPGSRYEVSVSAVR 
GTNESDSATTQFTTEIDAPKNLRVGSRTATSLDL 
EWDNSEAEVQEYKVVYITLAGEQYHEVLVPRGI 
GPTTRATLTDLVPGTEYGVGISAVMNSQQSVPAT 
MNARTELDSPRDLMVTASSETSISLIWTKASGPID 
HYRITFTPSSGIASEVTVPKDRTSYTLTDLEPGAE 
YnSVTAERGRQQSLESTVDAFTGFRPISHLHFSH 
VTSSSVNITWSDPSPPADRLILNYSPRDEEEEMME 
VSLDATKRHAVLMGLQPATEYIVNLVAVHGTVT 
SEPIVGSITTGIDPPKDITISNVTKDSVMVSWSPPV 
ASFDYYRVSYRPTQVGRLDSSVVPNTVTEFTITR 
LNPATEYEISLNSVRGREESERICTLVHTAMDNP 
VDLIATMTPTEALLQWKAPVGEVENYVIVLTHF 
AVAGEHLVDGVSEEFRLVDLLPSTHYTATMYAT 
NGPLTSGTISTNFSTLLDPPANLTASEVTRQSALIS 
WQPPRAEIENYVLTYKSTDGSRKELIVDAEDTWI 
RLEbLLENTOYmLQAAQDTTWSSITST^ 
GRVFPHPQDCAQHLMNGDTLSGVYPIFLNGELS 
QKLQVYCDMTTDGGGWIVFQRRQNGQTDFFRK 
WADYRVGFGNVEDEFWLGLDNIHRITSQGRYEL 
RVDMRDGQEAAFASYDRFSVEDSRNLYKLRIGS 
YNGTAGDSLSYHQGRPFSTEDRDNDVAVTNCA 
MSYKGAWWYKNCHRTNLNGKYGESRHSQGIN 
WYHWKGHEFSIPFVEMKMRPYNHRLMAGRKRQ 
SLQF 



623 



99 



1567 



RLQLPACPALSAAHPLALPSFSSQCHRAEARAAA 
AATAEGTMASGVTVNDEVIKVFNDMKVRKSST 
QEEIKKRKKAVLFCLSDDKRQir/EEAKQILVGDI 
GDTVEDPYTSFVKLLPLNDCRYALYDATYETECE 
SKKEDLVFIFWAPESAPLKSKMIYASSKDAIKKK 
FTGIKHEWQVNGLDDIKDRSTLGEBCLGGNVVVS 

LEGKPL 



PRGCWSSCLDAMFRLNSLSALAELAVGSRWYH 

GGSQPIQIRRRLMMVAFLGASAVTASTGLLWKR 

AHAESPPCVDNLKSDIGDKGKNKDEGDVCNHEK 

KTADLAPHPEEKKKKRSGFRDRKVMEYENRIRA 

YSTPDKIFRYFATLKVISEPGEAEVFMTPEDFVRS 

ITPNEKQPEHLGLDQYIIKRFDGKTEKISQEREKF 

ADEGSIFYTLGECGLISFSDYIFLTTVLSTPQRNFE 

lAFKMFDLNGDGEVDMEEFEQVQSHRSQTSMG 

MREIRDRPTTGim,KSGLCSALTTYFFGADLKGK 

LTKNFLEFQRKLQHDVLKLEFERHDPVDGRITE 

RQFGGMLLAYSGVQSKKLTAMQRQLKKHFKEG 

KGLTFQEVECTFTFLKNI>nDVDTALSFYH]^
LDKVTMQQVARTVAKVELSDHVCDWFALFDC 
DGNGELSNKEFVSMKQRLMRGLEKPKDMGFTR 
LMQAMWKCAQETAWDFALPKQ 


3177 


A 


182 


648 


LGWGSGAAVGGRQAARGAALGRRPMAAVLG 

ALGATRRLLAALRGQSLGLAAMSSGTHRLTAEE 

RNQAILDLKAAGWSELSERDAIYKEFSFHNFNQA 

FGFMSRVALQAEKMNHHPEWFNVYNKVQITLTS 

HDCGELTKKDVKLAKFIEKAAASV 


3178 


A 


8 


612 


ACGCRSFCGSTVMSLLLYYALPALGSYAMLSIFF 

LRRPHLLHTPRAPITRIRLGAHRGGSGELLENTM 

EAMENSMAQRSDLLELDCQLTRDRVVVVSHDE 

NLCRQSGLNRDVGSLDFEDLPLYKEKLEVYFSPG 

HFAHGSDRRMVRLEDLFQRFPRTPMSVEIKGKN 

EELIREIAGLVRRYDRNEinWASEKSSVMKKCK 


3179 


A 


88 


1496 


QETSKMETLSFPRYNVAEIVIHIRNKILTGADGKN 
LTKNDLYPNPKPEVLHMIYMRALQIVYGIRLEHF 
YMMPVNSEVMYPHLMEGFLPFSNLVTHLDSFLPI 
CRVm)FETADILCPKAJKJlTSRFLSGIIOTIHFREAC 
RETYMEFLWQYKSSADKMQQLNAAHQEALMK 
LERLDSVPVEEQEEFKQLSDGIQELQQSLNQDFH 
QKTIVLQEGNSQKKSNISEKTKRLNELKLSVVSL 
BCEIQESLKTKJVDSPEKLKhryXEKMKDTVQKLK 
NARQEWEKYEIYGDSVDCLPSCQLEVQLYQKK 
IQDLSDNREKLASILPCESLNLEDQIESDESELKKL 
KTEENSFKRLMIVKKEKLATAQFKINnKXm 
QYKRTVBBDCNKVQEKRGAVYERVTTINHErQKI 
imGIQQbKDAADREKLKSQEIFLNLKTAL^^ 
GmKAAEDSYAKTOEKTAELKRKMFm^ 


3180 


A 


298 


7086 


GNMACWPQLRLLLWKNLTFRRRQTCQLLLEVA 

WPLFIFLILISVRLSYPPYEQHECHFPNKAMPSAG 

TLPWVQGnCNANNPCFRYPTPGEAPGVVGNFNK 

SIVARLFSDARRLLLYSQKDTSMKDMRKVLRTL 

QQIKKSSSNLKLQDFLVDNETFSGFLYHNLSLPK 

STVDKMLRADVILHKVFLQGYQLHLTSLCNGSK 

SEEMIQLGDQEVSELCGLPREKLAAAERVLRSN 

MDILKPILRTLNSTSPFPSKELAEATKTLLHSLGT 

LAQELFSMRSWSDMRQEVMFLTNVNSSSSSTQI 

YQAVSRIVCGHPEGGGLKIKSLNWYEDNNYKAL 

FGGNGTEEDAETFn)NSTTPYCNDLMKNLESSPL 

SRnWKALKPLLVGKILYTPDTPATRQVMAEVNK 

TFQELAWHDLEGMWEELSPKIWTFMENSQEMD 

LVRMLLDSRDNDHFWEQQLDGLDWTAQDIVAF 

LAKHPEDVQSSNGSVYTWREAFNETNQAIRTISR 

FMECVNLNKLEPIATEVWLINKSMELLDERKFW 

AGXVFTGrrPGSmLPHHWYKIRMGmNVERTNK 

IKDGYWDPGPRADPFEDMRYVWGGFAYLQDW 

EQAmVLTGTEKKTGVTMQQMPYPCYVDDIFLR 

VMSRSMPLFMTLAWIYSVAVIIKGIVYEKEARLK 

ETMRIMGLDNSILWFSWFISSLIPLLVSAGLLWI 

LKLGNLLPYSDPSWFVFLSVFAWTILQCFLIST 

U?SRAmAAACGGnYFTLYLPYVLCVAWQDYV 

GFTLKIFASLLSPVAFGFGCEYFALFEEQGIGVQW 

DNLFESPVEEDGFNLTTSVSMMLFDTFLYGVMT 

WYIEAVFPGQYGIPRPWYFPCTKSYWFGEESDEK 

SHPGSNQKRISEICMEEEPTHLKLGVSIQNLVKVY

RDGMKVAVDGLALNFYEGQITSFLGHNGAGKTT 

TMSILTGLFPPTSGTAYILGKDIRSEMSTIRQNLG 

VCPQHNVLFDMLTVEEHIWFYARLKGLSEKHVK 

AEMEQMALDVGLPSSKLKSKTSQLSGGMQRKLS 

VALAFVGGSKWILDEPTAGVDPYSRRGIWELLL 

KYRQGRinLSTHHMDEADVLGDRIAnSHGKLCC 

VGSSLFLKNQLGTGYYLTLVKKDVESSLSSCRNS 

SSTVSYLKKEDSVSQSSSDAGLGSDHESDTLTID 

VSAISNLIRKHVSEARLVEDIGHELTYVLPYEAA 

KEGAFVELFHEIDDRLSDLGISSYGISETTLEEIFL 

KVAEESGVDAETSDGTLPARRNRRAFGDKQSCL 

RPFTEDDAADPNDSDIDPESRETDLLSGMDGKGS 

YQVKGWKLTQQQFVALLWKRLLIARRSRKGFF 

AQIVLPAVFVCIALVFSLIVPPFGKYPSLELQPWM 

YNEQYTFVSNDAPEDTGTLELLNALTKDPGFGT 

RCMEGNPIPDTPCQAGEEEWTTAPVPQTIMDLFQ 

NGNWTMQNPSPACQCSSDKIKKMLPVCPPGAGG 

LPPPQRKQNTADILQDLTGKKISDYLVKTYVQIIA 

KSLKNKIWVNEFRYGGFSLGVSNTQALPPSQEV 

NDATKQMKKHLKLAKDSSADRFLNSLGRFMTG 

LDTRNNVKVWFNNKGWHAISSFLNVINNAILRA 

NLQKGENPSHYGITAFNHPLNLTKQQLSEVAPM 

TTSVDVLVSICVIFAMSFVPASFVVFLIQERVSKA 

KHLQFISGVKPVIYWLSNFVWDMCNYVVPATLV 

mFICFQQKSYVSSTOLPVLALLLLLYGWSIlPLM 

YPASFVFKIPSTAYWLTSVNLFIGINGSVATFVL 

ELFTDNKLNNINDILKSVFLIFPHFCLGRGLIDMV 

KNQAMADALERFGENRFVSPLSWDLVGRNLFA 

MAVEGVVFFLITVLIQYRFFIRPRPVNAKLSPLND 

EDEDVRRERQRILDGGGQM)ILEIKELTKIYRRK 

RKPAVDRICVGIPPGECFGLLGVNGAGKSSTFKM 

LTGDTTVTRGDAFLNKNSILSNIHEVHQNMGYCP 

QFDAITELLTGREHVEFFALLRGVPEKEVGKVGE 

WAIRKLGLWYGEKYAGNYSGGNKRKLSTAMA 

LIGGPPVVFLDEPTTGMDPKARRFLWNCALSW 

KEGRSVVLTSHSMEECEALCTRtvlAIMVNGRFRC 

LGSVQHLK]^GDGYTIVVRIAGSNPDLKPVQDF 

FGLAFPGSWKEKHRNMLQYQLPSSLSSLARIFSI 

LSQSKKRLHIEDYSVSQTTLDQVFVNFAKDQSDD 

DHLKDLSLHKNQTVVDVAVLTSFLQDEKVKESY 

V 


3181 


A 


215 


1367 


PPATSQAALPEALSKGRETPRPATHPARSQDVRP 

LSCPFDFLRDKVEWSEEQAAAAERKVQENSIQR 

VCQEKQVDYEINAIKYWl^roFVrKIHENGFFKDR 

HWLFTEFPEIJ^PSQNQNHOCDWFLENKSEVPEC 

R2WEDGPGLIMEEQHKCSSKSLEHKTQTPPVEEN 

VTQKISDLEICADEFPGSSATYRILEVGCGVGNTV 

FPILQTNNDPGLFVYCCDFSSTAIELVQTNSEYDP 

SRCFAFVHDLCDEEKSYPVPKGSLDIULIFVLSAl 

1 mr\TJ'\ Ar\Xr A TKTDT OT>T T VX>nf^\Jf\r[ T "PT^VTrTJ VTiM 

VPDKMQKAlNKJ-»bKJjlrfJNJrOUM i vjjvi 
AQLRFKKGQCLSGNFYVRGDGTRVYFFTQEELD 
TLFTTAGLEKVQNLVDRRLQVNRGKQLTMYRV 
WIQCKYCKPLLSSTS 


3182 


A 


3 


1289 


GSETQHLPRDPQHLPWDPQQHQDRRRPELFHAF
ARDSAPPPSMVLAAETTSQQERLQAIAEKRKRQ
AEIENKRRQLEDERRQLQHLKSKALRERWLLEG 

TPSSASEGDEDLRRQMQDDEQKTRLLEDSVSRLE 

KGIEVLERGDSAPAAAKENAAAPSPVRAPAPSPA 

KEERKTEVVMNSQQTPVGTPKDKRVSNTPLRTV 

DGSPMMKAAMYSVEITVEKDKVTGETRVLSSTT 

LLPRQPLPLGIKVYEDETKVVHAVDGTAENGIHP 

LSSSEVDELIHKADEVTLSEAGSTAGAAETRGAV 

EGAARTTPSRREITGVQAQPGEATSGPPGIQPGQE 

PPVTMIFMGYQNVEDEAETKKVLGLQDTITAEL 

WIEDAAEPKEPAPPNGSAAEPPTEAASREENQA 

GPEATTSDPQDLDMKKHRCKCCSIM 


3183 


A 


333 


1931 


lAPTGGSHSEIQKQLGSGGDSSSQRRAERRTEPRS 

APRPRWGRSARSPGAHKLPGPPRRRDPGAWARL 

EAAAAHRHSRGSMGRRMRGAAATAGLWLLAL 

GSLLALWGGLLPPRTELPASRPPEDRLPRRPARS 

GGPAPAPRFPLPPPLAWDARGGSLKTFRALLTLA 

AGADGPPRQSRSEPRWHVSARQPRPEESAAVHG 

GVFWSRGLEEQVPPGFSEAQAAAWLEAARGAR 

MVALERGGCGRSSNRLARFADGTRACVRYGINP 

EQIQGEALSYYLARLLGLQRHVPPLALARVEAR 

GAQWAQVQEELRAAHWTEGSWSLTRWLPNLT 

DVWPAPWRSEDGRLRPLRDAGGELANLSQAEL 

VDLVQWTDLILFDYLTANFDRLVSNLFSLQWDP 

RVMQRATSNLHRGPGGALVFLDNEAGLVHGYR 

VAGMWDKYNEPLLQSVCVFRERTARRVLELHR

GQDAAARLLRLYRRHEPRFPELAALADPHAQLL

QRRIX)FLAKHILHCKAKYGRRSGDLySPGGKER

DLGLGYG


3184 


A 


1 


1004 


GSTHASADAWAQWFCTEALVMGAPVWYLVAA 

ALLVGFILFLTRSRGRAASAGQEPLHNEELAGAG 

RVAQPGPLEPEEPRAGGRPRRRRDLGSRLQAQR 

RAQRVAWAEADENEEEAVILAQEEEGVEKPAET 

HLSGKIGAKKLRKLEEKQARKAQREAEEAEREE 

RKRLESQREAEWKKEEERLRLEEEQKEEEERKA 

REEQAQREHEEYLKLKEAFVVEEEGVGETMTEE 

QSQSFLTEFINYIKQSKVVLLEDLASQVGLRTQD 

TINRIQDLLAEGTITGVIDDRGKFmTPEELAAVA 

NFIRQRGRVSIAELAQASNSLIAWGRESPAQAPA 


3185 


A 


2981 


7173 


CLLAGKFSSTLYETGGCDMSLVNFEPAARRASNI 

CDTDSHVSSSTSVRFYPHDVLSLPQIRLNRLLTBD 

TDLLEQQDIDLSPDLAATYGPTEEAAQKVKHYY 

RF\\aLPQLWIGINFDRLTLLALFDRNREILENVLA 

VILAJLVAFLGSILLIQGFFRDIWVFQFCLVIASCQ 

YSLLKSVQPDSSSPRHGHNRIIAYSRPVYFCICCG 

LIWLLDYGSRNLTATTCFKLYGITFTNPLVFISARD 

LVIWTLCFPr/FFIGLLPQVNTFVMYLCEQLDIHI 

FGGNATTSLLAALYSFICSIVAVALLYGLCYGAL 

KDSWDGQHIPVLFSIFCGLLVAVSYHLSRQSSDP 

SVLFSLVQSKIFPKTEEKNPEDPLSEVKDPLPEKL 

RNSVSERLQSDLWCIVIGVLYFAIHVSTVFTVLQ 

PALKYVLYTLVGFVGFVTHYVLPQVRKQLPWH 

CFSHPLLKTLEYNQYEVRNAATMMWFEKLHVW 

LLFVEKOTYPLIVLNELSSSAETIASPKKLNTELG 

ALMXTVAGLKLLRSSFSSPTYQYVTVIFTVLFFKF 

DYEAFSETMLLDLFFMSILFNKLWELLYKLQFVY

TYIAPWQITWGSAFHAFAQPFAVPHSAMLFIQAA 

VSAFFSTPLKPFLGSAIFITSYVRPVKFWERDYNT 

KRVDHSNTRLASQLDKNPGTYCQQREVEAITEG 

VEEDEGFCCCEPGHIPHMLSFNAAFSQRWLAWE 

VIVmmEGYSrroNSAASMLQWDLRKVLTTY 

YVKGIIYYVTTSSKLEEWLANETMQEGLRLCAD 

RNYVDVDPTFNPNIDEDYDHRLAGISRESFCVIY 

LNWIEYCSSRRAKPVDVDKDSSLVTLCYGLCVL 

GRRALGTASHHMSSNLESFLYGLHALFKGDFRIS 

SIRDEWIFADMELLRKVVVPGIRMSIKLHQDHFT 

SPDEYDDPTVLYEAIVSHEKNLVIAHEGDPAWRS 

AVLANSPSLLALRHVMDDGTNEYKIIMLNRRYL 

SFRVIKVNKECVRGLWAGQQQELVFLRNRNPER 

GSIQNAKQALRNMINSSCDQPIGYPIFVSPLTTSY 

SDSHEQLKDILGGPISLGNIRNFIVSTWHRLRKGC 

GAGCNSGGNIEDSDTGGGTSCTGNNATTANNPH 

SNVTQGSIGNPGQGSGTGLHPPVTSYPPTLGTSHS 

SHSVQSGLVRQSPARASVASQSSYCYSSRHSSLR 

MSTTGFVPCRRSSTSQISLRNLPSSIQSRLSMVNQ 

MEPSGQSGLACVQHGLPSSSSSSQSIPACKHHTL 

VGFLATEGGQSSATDAQPGNTLSPANNSHSRKA 

EVIYRVQIVDPSQILEGINLSKRKELQWPDEGIRL 

KAGRKSWKDWSPQEGMEGHVIHRWVPCSRBPG 

TRSHIDKAVLLVQIDDKYVTVIETGVLELGAEV 


3186 


A 


3 


470 


SLSAMRFLAATFLLLALSTAAQAEPVQFKDCUSV 

DGVIKEVNVSPCPTQPCQLSKGQSYSVNVTFTSN 

IQSKSSKAVVHGILMGVPVPFPIPEPDGGKSGINC 

PIQKi)kTYSYLNKLPVKSEYPSIKLVVEWQLQDD 

KNOSLFCWEIPVQIVSHL 


3187 


A 


3 


470 


SLSAMRFLAATFLLLALSTAAQAEPVQFKUCUbV 

DGVKEVNVSPCPTQPCQLSKGQSYSVNVTFTSN 

IQSKSSKAWHGILMGVPVPFPIPEPDGCKSGINC 

PIQKDKTYSYLNKLPVKSEYPSIKLWEWQLQDD 

KNOSLFCWEIPVQIVSHL 


3188 


A 


2 


3483 


PRVRTKLn.LVNDKKRYERVGGGPKKLGRDVEM 

EEMIEQLQEKVHELEKQNDTLKNRLISAKQQLQT 

QGYRQTPYNNVQSRINTGRRKANBNAGLQECPR 

KGIKFQDADVAETPHPMFTKYGNSLLEEARGEIR 

NLENVIQSQRGQIEELEHLAEILKTQLRRKENEIE 

LSLLQLREQQATDQRSNIRDNVEMIKLHKQLVE 

KSNALSAMEGKFIQLQEKQRTLKISHDALMANG 

DELNMQLKEQRLKCCSLEKQLHSMKFSERRIBEL 

QDRINDLEKERELLKENYDKLYDSAFSAAHEEQ 

WKLKEQQLKVQL^QLETALKSDLTDKTEILDRL 

KTERDQNEKLVQENRELQLQYLEQKQQLDELKK 

RIKLYNQENDINADELSEALLLIKAQKEQKNGDL 

SFLVKVDSEINKDLERSMRELQATHAETVQELEK 

TRNMLIMQHKINKDYQMEVEAVTRJKMENLQQD 

YELKVEQYVHLLDIRAARIHKLEAQLKDL\YGTK 

rkVTrwPT7T\>n>nn^vnPTOETIHLERGENLFEIHIN 

KVTFSSEVLQASGDKEPVTFCTYAFYDFELQTTP 

WRGLHPEYNFTSQYLVHVNDLFLQYIQKNTITL 

EVHQAYSTEYETIAACQLKFHEILEKSGRIFCTAS 

LIGTKGDIPNFGTVEYWFRLRVPMDQAIRLYRER 

AKALGYITSNFKGPEHMOSLSOQAPKTAQLSSID

STDGNLNELHITIRCCNHLQSRASHLQPHPYWY 

KFFDFADHDTAUPSSNDPQFDDHMYFPVPMNM 

DLDRYLKSESLSFYVFDDSDTQENIYIGKVNVPLI 

SLAHDRCISGIFELTDHQKHPAGTIHVILKWKFA 

YLPPSGSnTEDLGNFmSEEPEWQRLPPASSVST 

LVLAPRPKPRQRLTPVDKKVSFVDIMPHQSDVSQ 

EGSVDEVKENTEKMQQGKDDVSLLSEGQLAEQS 

LASSEDETEITEDLEPEVEEDMSASDSDDCnPGPI 

SKNIKQPSEKIRIEIIALSLNDSQVTMDDTIQRLFV 

ECRFYSLPAEETPVSLPKPKSGQWVYYNYSNVIY 

VDKENNKAKIU:>n.KAILQKQEMPNRSLRFTVVS 

DPPEDEQDLECEDIGVAHVDLADMFQEGRDLIE 

QNIDVFDARADGEGIGKLRVTVEALHALQSVYK 

QYRDDLEA 








1 17S 

li rJ 


MKGSGWHLRSGMVGTLTTT7LPHWRRTAHVGTN 

ILTAVSYLKGLWMECVWHSTGIYQCQIYRSLLA 

LPQDLQAARALMGISCLLSGIACACAVIGMKCTR 

CAKGTPAKTTFAILGGTLFILAGLLCMGAVSWTT 

NDVVQNFYNPLLPSGMKFEIGQALYLGFISSSLSL 

IGGTLLCLSCQDEAPYRPYQAPPRATnTANTAP 

AYQPPAAYKDNRAPSVTSATHSGYRLNDYV 


3190 


A 


267 


1037 


DRMAWQGLVLAACLLMFPSTTADCLSRCSLCA 
VKTODGPKPTNPLTC55LOCOAALLPSEEWERCOSF 
LSFFTPSTLGLNDKEDLGSKSVGEGPYSELAKLS 
GSFLKELEKSKFLPSISTKENTLSKSLEEKLRGLS 
DGFREGAESELMRDAQLNDGAMETGTLYLAEE 
^BPKEQVKRVbGFLRKYPKRSSEVAGEGDGDSM 
GIffiDLYldiYGGFLRRIRPKLKWDNQKRYGGFLR 
ROFKVVTRSOEDPNAYSGELFDA 


3191 


A 


29 


574 


GTSAGAQTKGALCQLKVPTEKLPSPLPTMADEID 

FTTGDAGASSTYPMQCSALRKNGFVVLKGRPCK 

IVEMSTSKTGKHGHAKVHLVGIDIFTGKKYEDIC 

PSTHNMDVPNIKR>JDYQLICIQDGYLSLLTETGE 

VREDLKLPEGELGKEIEGKYNAGEDVQVSVMCA 

MSEEYAVAIKPCK 


3192 


A 


105 


1661 


KVSADGMQSCESSGDSADDPLSRGLRRRGQPRV 

WIGAGLAGLAAAKALLEQGFTDVTVLEASSHIG 

GRVQSVKLGHATFELGATWIHGSHGNPIYHLTE 

ANGLLEETTDGERSVGRISLYSKNGVACYLTNH 

GRRIPKDWEEFSDLYNEVYNLTQEFFRHDKPVN 

AESQNSVGVFTREEVRNRIRNDPDDPEATKRLKL 

AMIQQYLKVESCESSSHSMDEVSLSAFGEWTEIP 

GAHHUPSGFMRWELLAEGIPAHVIQLGKPVRCI 

HWDQASARPRGPEIEPRGEGDHNHDTGEGGQGG 

EEPRGGRWDEDEOWSVWECEDCELIPADHVIV 

TVSLGVLKRQYTSFFRPGLPTEKVAAIHRLGIGTT 

DKTFLEFEEPFWGPECNSLQFVWEDEAESHTLTY 

PPELWYRKICGFDVLYPPERYGHVLSGWICGEEA 

LVMEKCDDEAVAEICTEMLRQFTGNPNIPKPRRI 

LRSAWGSNPYFRGSYSYTQVGSSGADVEKLAKP 

LPYTESSKTATK 


3193 


A 


1 


1928 


QLGTRRCLRGDKVTNAMQDFLVTNLEPRFIEPQT 
ANLSWFKDSNSTTPLEFVLSPGTDPAADLYKFA 
EEMKFSKKLSAISLGQGQGPRAEAMMRSSIERGK 
WVFFQNCHLAPSWMPALERLIEHINPDKVHRDF
RLWLTSLPSNKFPVSILQNGSKMTIEPPRGVRAN

LLKSYSSLGEDFLNSCHKVMEFKSLLLSLCLFHG

NALERRKFGPLGFNIPYEFTDGDLRICISQLKMFL

DEYDDIPYKVLKYTAGEINYGGRVTDDWDRRCI

MNILEDFYNPDVLSPEHSYSASGIYHQIPPTYDLH

GYLSYIKSLPLNDMPEIFGLHDNAMTFAQNETFA

LLGTHQLQPKSSSAGSQGREEIVEDVTQNILLKVP

EPINLQWVMAKYPVLYEESMNTVLVQEVIRYNR

LLQVITQTLQDLLKALKGLWMSSQLELMAASL

YNNTVPELWSAKAYPSLKPLSSWVMDLLQRLDF

LQAWIQDGIPAVFWISGFFFPQAFLTGTLQNFAR

KFVISIDTISFDFKVMFEAPSELTQRP.QVGCYFFLG

LFLEGARWDPEAFQLAESQPKELYTEMAVIWLL

PTPNRKAQDQDFYix:PIYKTLTE^GTLSTTGHST

NYVIAVEIPTHQPQRHWIKRGVALICALDY


3194 


A 


1 


1023 


DGWTPVHAAVDTGNVDSLKLLMYHRIPAHGNS

FNEEESESSVFDLDGGEESPEGISKPVVPADLINH

ANREGWTAAHIAASKGFKNCLEILCRHGGLEPE

RW^KCNRTVHDVATDDCKHLLENLNALKIPM

VGEIEPSNYGSDDLECENTICALNIRKQTSWDDFS

KAVSQALTNHFQAISSDGWWSLEDVTCNNTTDS

NIGLSARSIRSITLGNVPWSVGQSFAQSPWDFMR

KNKAEHITVLLSGPQEGCLSSVTYASMIPLQMM

QNYLRLVEQYHNVIFHGPEGSLQDYIYHQLALCL

KHRQMGWQDSPVEIVEELEVGCWFFPREQLLRT

CSLVA


3195


A 




1809 


MAASAQVSVTFEDVAVTFTQEEWGQLDAAQRT

LYQEVMLETCGLLMSLGCPLFKPELIYQLDHRQE

LWMATKDLSQSSYPGDNTBCPKTTEPTFSHLALPE

EVLLQEQLTQGASKNSQLGQSKDQDGPSEMQEV

HLKIGIGPQRGKLLEKMSSERDGLGSDDGVCTKI

TQKQVSTEGDLYECDSHGPVTDALIREEKNSYK

CEECGKVFKKNALLVQHERIHTQVKPYECTECG

KTFSKSTHLLQHLNHTGEKPYKCMECGKAFNRR

SHLTRHQRIHSGEKPYKCSECGKAFTHRSTFVLH

HRSHTGEKPFVCKECGBCAFRDRPGFIRHYNHTGE

KPYECIECIECGKAFNRRSYLTWHQQIHTGVKPF

ECNECGKAFCESADLIQHYIIHTGEKPYKCMECG

KAFMRSHLKQHQRIHTGEKPYECSECGKAFTH

CSTFVLHKRTHTGEKPYECKECGKAFSDRADLIR

HFSIHTGEKPYECVECGKAFNRSSHLTRHQQIHT 
GEKPYECIQCGKAFCRSANLIRHSIIHTGEKPYEC 
SECGKAFNRGSSLTHHQRfflTGRNPUVTDVGRP 
FMTAQTSVNIQELLLGKEFLNITIBENLW 


3196 


A 


1400 


264 


VGFWERPLRSSRWFRRSLRRWEMLARAARGTG 

ALLLRGSLLASGRAPRRASSGLPRNTVVLFVPQQ 

EAWWERMGRFHRCLEPGLNILIPVLDRIRYVQSL 

KEIVINVPEQSAVTLDNVTLQIDGVLYLRIMDPY 

KASYGVEDPEYAVTQLAQTTMRSELGKLSLDKV 

FWERESLN ASrvDAiNyAAlJU WUIKOLK I cusx^in 

VPPRVKESMQMQVEAERRKRATVLESEGTRESA 

INVAEGKKQAQILASEAEKAEQINQAAGEASAVL 

AKAKAKAEAIRDLAAALTQHNGDAAASLTVAEQ 

YVSAFSKLAKDSNTILLPSNPGDVTSMVAQAMG 

VYGALTKAPVPGTPDSLSSGSSRDVQGTDASLDE

ELDRVKMS 


3197 


A 


66 


3632 



LWECAAAAAGQRDGGVTLFLKGRVLGRRCAAS

LFAREVCVSTSSSRPACFLHCARARGEQMHQMA

SGVGSMKRSPRKMWRPGEKKEPQGVVYEDVRD

DTEDFKEPLKWFEGSAYGLQNFNKQKKLKTCD

DMDTFFLHYAAAEGQIELMEKITRDSSLEVLHE

MDDYGNTPLHCAVEKNQIESVKFLLSRGANPNL

RNFNMMAPLMAVQGMNNEVMKVLLEHRTDDV

NLEGENGNTAVIIACTTNNSEALQILLNKGAKPC

KSNKWGCFPFFLQAAFSGSKECMEIILRFGEEHGY

SRQLHINFMNNGKATPLHLAVQNGDLEMIKMCL

DNGAQIDPVEKGRCTAIHFAATQGATEIVKLMIS

SYSGSVDIVNTTDGCHETMLHRASLFDHHELAD

YLISVGADINKIDSEGRSPLILATASASWNIVNLL

LSKGAQVDIKDNFGRNFLHLTVQQPYGLKNLRP

EFMQMQQIKELVMDEDNDGCTPLHYACRQGGP

GSVNNLLGFNVSIHSKSKDKKSPLHFAASYGRIN

TCQKLLQDISDTRLLNEGDLHGMTPLHLAAKNG

HDKVVQLLLKKGALFLSDHNGWTALHHASMGG

YTQ™KVILDTNLKCTDRLDEDGNTALHFAARE

GHAKAVALLLSHNADIVLNKQQASFLHLALHNK

RKEWLTURSKRWDECLKIFSHNSPGNKCPITEM

IEYLPECMKVLLDFCMLHSTEDKSCRDYYIEYNF

KYLQCPLEFTKKTPTQDVIYEPLTALNAMVQNN

RIELLNHPVCKEYLLMKWLAYGFRAHMMNLGS

YCLGLIPMTILVWIKPGMAFNSTGIINETSDHSEI

lI)TTN'sihLlKTCMILVFi:SSIFGYeKEAGQIEQQK 

RkVTMDISNVLEWIIYTtGIIFVLPL^ 

WQCGAIAVYFYWMNFLLYLQRFENCGIFIVMLE 

VILKTLLRSTVVFIFLLLAFGLSFYILLNLQDPFSS 

PLLSIIOTFSMMLGDINYRESFLEPYLRNELAHPV 

LSFAQLVSFTIFVPIVLMNLLIGLAVGDIAEVQKH 

ASLKRIAMQVELHTSLEKKLPLWFLRKVDQKSTI 

VYPNKPRSGGMLFHIFCFLFCTGEIRQEIPNADKS 

LEMEILKQKYRLKDLTFLLEKQHELIKLnQKMEn 

SETEDDDSHCSFQDRFKKEQMEQRNSRWNTVLR 

AVKAKTHHLEP 


3198 


A 


51 


2177 


KEKSLHHVDQRPPLWHPGRPGTSQSAAMNASSE 

GESFAGSVQIPGGTTVLVELTPDIHICGICKQQFN 

NLDAFVAHKQSGCQLTGTSAAAPSTVQFVSEET 

WATQTQTTTRTITSETQTITVSAPEFVFEHGYQT 

YLPTESNENQTATVISLPAKSRTKKPTIPPAQKRL 

NCCYPGCQFKTAYGMKDMERHLKIHTGDKPHK 

CEVCGKCFSRKDKLKTHMRCHTGVKPYKCKTC 

DYAAADSSSLNKHLRfflSDERPFKCQICPYASRN 

SSQLTVHLRSHTGDAPFQCWLCSAKFKISSDLKR 

HMRVHSGEKPFKCEFCnWRCTMKGNLKSHIRIK 

HSGNNFKCPHCAFLGDSKATLRKHSRVHQSEHR 

EKCSECSYSCSSKAALRIHERIHCTVRPFKCNYCS 

FDSKQPSNLSKJIMKKFHGDMVKTEALERKDTG 

RQSSRQVAKLDAKKSFHCDICDASFMREDSLRS 

HKRQHSEYNESKNSDVTVLQFQIDPSKQPATPLT 

VGHLQVPLQPSQVPQFSEGRVKUVGHQVPQANT 

IVQAAAAAVNIVPPALVAQNPEELPGNSRLQILR 

QVSLIAPPQSSRCPSEAGAMTQPAVLLTTHEQTD

GAlLHQTLIPTASGGPQEGSaNQTFITSSGrrCTD 

FEGLNALIQEGTAEVTWSDGGQNIAVATTAPPV 

FSSSSOOELPKOTYSnQGAAHPALLCPADSIPD 


3199 


A 


13 


2247 


QSFHSMEGDPSGLPLLARGASCYSLICPCPRPAD 

WSILQGTDWSILQSADWCIYNPLARHRALTGVFL 

QSADWCTYNPLARQKSSPSPHSTQEVQLASPLTR 

RPNKKDSAERNHRPAREGSVAQRQPNPAALEKA 

EPAARKRNEREGGGSQEPGREHSLEKGYWAPGL 

GPDPSMCSKQVDPSEGASSHLKHRGGSRAAHLE 

VRKLLRRLVGALVAEAGFCYVQVAEGQRWGV 

LEVAEAAAAPVQHEPTAAVATQSRWFPRGTRPG 

LCSLPIAVAALLCPGSGPGAQSGLEFVERPPPSPL 

AWLARWPLPPPAGRCPRDAPEARVPEKARAEG 

SERENNYGCGWGGEMTTLVLDNGAYNAKIGY 

SHENVSVIPNCQFRSKTARLKTFTANQIDEIKDPS 

GLFYILPFQKGYLVNWDVQRQVWDYLFGKEMY 

QVDFLDTNfflTEPYFNFTSIQESMNEILFEEYQFQ 

AVLRVNAGALSAHRYFRDNPSELCCnVDSGYSF 

THIVPYCRSKKKBCEAIIRINVGGKLLTNHLKEIISY 

RQLHVMDETEIVINQVKEDVCYVSQDFYRDMDI 

AKLKGEENTVMIDYVLPDFSTIKKGFCKPREEMV 

LSGKYKSGEQILRLANERFAVPEILFNPSDIGIQE 

MGIPEAIVYSIQNLPEEMQPHFFKNIVLTGGNSLF 

PGFRDRVYSEVRCLTPTDYDVSVVLPENPITYAW 

EGGKLISENDDFEDMWTREDYEENGHSVCEEK 

FDI 


3200 


A 


3


307


AVQRIRHEMNIFRLTGDLSHLAAIVILLLKIWKTK 

SCAGISGKSQLLFALVFTrRYLDLFTSFISLYNTS 

MKVWYAIHRNVFHLQCTGLWTLNLCQLCIFN 


3201 


A 


1 


469 


IRHEGRGQRGKMELVQVLKRGLQQITGHUULKU 

YLRVFFRTNDAKVGTLVGEDKYGNKYYEDNKQ 

FFGREmWVVYTTEMNGKNTFWDVDGSMVPPE 

WHRWLHSMTDDPPTTXPLTARKFIWTNHKFNVT 

GTPEOYVPYSTTRKKIQEWIPPSTPYK 


3202 


A 


144 


840 


NSSQRMATHALEIAGLFLGGVGMVGTVAV 1 VM 

PQWRVSAFIENNIVVFENFWEGLWMNCVRQANI 

RMQCKIYDSLLALSPDLQAARGLMCAASVMSFL 

AFMMAILGMKCTRCTGDNEKVKAHILLTAGIIFn 

TGMWLPVSWVANAHRDFYNSIVNVAQKRELG 

EALYLGWTTALVLIVGGALFCCVFCCNEJCSSSYR 

YSIPSHRTTQKSYHTGKKSPSVYSRSQYV 


3203 


A 


2 


473


KYRYRRPYPVMRKICQVGPAGLAFILNISPVAHR
VALCmAGCQEQAAWYHTLQELFFLVSAYFFSCP 
VPEKYFPGSCDIVGHGHQIFHAFLSICTLSQLEAIL 
LDYQGRQEIFLQRHGPLSVHMACLSFFFLAACSA 
ATAALLRHKVKARLTKKDS 


3204 


A 


1808 


668


PESAPLPAFISSRILPAAWRNWCSYWTRTISCHV
QNGTYLQRVLQNCPWPMSCPGSSYRTVVRPTYK 
VMYKTVTAREWRCCPGHSRVSCEEVAGSSASLE 

mrarci^CTKjTRRMAT RPTAFSGCLNCSKVSELTER 

LK\^EAKMTMLTVffiQPVPPTPATPEDPAPLWGP 
PPAQGSPGDGGLQDQVGAWGLPGPTGPKGDAG 
SRGPMGMRGPPGDPLLSNTFTETNNHWPQGPTG 
PPQPPGPMGPPGPPGPTGVPGSPGHIGPPGPTGPK 
GISGHPGEKGERGLRGEPGPOGSAGQRGEPGPKG
DPGEKSHWGEGLHQLREALKILAERVLILETMIG 
LYEPELGSGAGPAGTGTPSLLRGKRGGHATNYRI 
VAPRSRDERG 


3205 


A 


2810 


1652 


RTSTQKWQSVFhnDSQEHIJERFyCNPENDRMRM 

KYGGQEFWADLNAMNVYETTEFDQLRRLSTPPS 

SNVNSrraXVWFFCRDHFGWREYPESVIRLIEE 

ANSRGLKEVRFMMWNNHYILHNSFFRREIKRRP 

LFRSCFILLPYLQTLGGVPTQAPPPLEATSSSQnCP 

DGVTSANFYPETWVYMHPSODFIOVPVSAEDKS 

YRHYNLFHKTVPEFKYRILQILRVQNQFLWEKY
KRKKEYMNRKMFGRDRIINERHLFHGTSQDVVD

GICBCHNFDPRVCGKHATMFGQGSYFAKKASYSH 
NFSKKSSKGVHFMFIJVKVLTGRYTMGSHGMRR 
PPPVNPGSVTSDLYDSCVDNFFEPQIFVIFNDDQS 
YPYFVIQYEEVSNTVSI 


3206 


A 


297 


4500 


CLVDSKLWKGARSVYHQLFMSSLLMDLKYKKL 

FAVRFAKNYERLQSDYVTDDHDREFSVADLSVQ 

IFTVPSLARMLITEENLMSmKTFMDHLRHRDAQ 

GRFQFERYTALQAFKFRRVQSLILDLKYVLISKPT 

EWSDELRQKFLEGFDAFLELLKCMQGMDPITRQ 

VGQHffiMEPEWEAAFTLQMKLTHVISMMQDWC 

ASDEKVLIEAYKKCLAVLMQCHGGYTDGEQPIT 

LSICGHSVETIRYCVSQEKVSIHLPVSRLLAGLHV 

LLSKSEVAYKFPELLPLSELSPPMLIEHPLRCLVL 

CAQVHAGMWRRNGFSLVNQIYYYHNVKCRRE 

MFDKDWMLQTGVSMJ^PNHFLMIMLSRFELY 

Q'lFSTPDYGKRFSSEiraKDWQQ>INTLffi^ 

nMLVGERFSPGVGQWATDEIKREIIHQLSIKPM 

AHSELVKSLPEDENKETGMESVIEAVAHFKKPGL 

TGRGMYELKPECAKEFNLYFYHFSRAEQSKAEE 

AQRKLKRQNREDTALPPPVLPPFCPLFASLVNILQ 

SDVMLCIMGTILQWAVEHNGYAWSESMLQRVL 

HLIGMALQEEKQHLENVTEEHVVTFl'FrQKISKP 

GEAPKNSPSILAMLETLQNAPYLEVHKDMIRWIL 

KTFNAVKKMRESSPTSPVAETEGTIMEESSRDKD 

KAERKRKAEIARLRREKMAQMSEMQRHFIDEN 

KELFQQTLELDASTSAVLDHSPVASDMTLTALGP 

AQTQVPEQRQFVTCILCQEEQEVKVESRAMVLA 

AFVQRSTVLSKNRSKFIQDPEKYDPLFMHPDLSC 

GTHTSSCGHIMHAHCWQRYFDSVQAKEQRRQQ 

RLRLHTSYDVENGEFLCPLCECLSNTVIPLLLPPR 

NIFN>nU.NFSDQPNLTQWIRTISQQIKALQFLRKE 

ESTPNNASTKNSENVDELQLPEGFRPDFRPKIPYS 

ESIKEMLTTFGTATYKVGLKVHPNEEDPRVPIMC 

WGSCAYnQSBERILSDEDKPLFGPLPCRLDDCLR 

SLTRFAAAHWTVASVSWQGHFCKPFASLVPND 

SHEELPCILDIDMFHLLVGLVLAFPALQCQDFSGI 

SLGTGDLHEFHLVTMAHnQILLTSCTEENGMDQE 

NPPCEEESAVLALYKTLHQYTGSALKEIPSGWHL 

WRSVRAGIMPFLKCSALFFHYLNGVPSPPDIQVP 

GTSHFEHLCSYLSLPNNLICLFQENSEIMNSLIES 

WCRNSEVKRYLEGERDAIRYPRESNKLINLPEDY 

SSLINQASNFSCPKSGGDKSRAPTLCLVCGSLLCS 

QSYCCQTELEGEDVGACTAHTYSCGSGVGDFLR 

VRECQVLFLAGKTKGCFYSPPYLDDYGETDQGL

RRGNPLHLCKJiKtiaaQKLWHQHSVTEiilGiiAg 
RATSIQTLVGIDWOHL 


3207 J 


k 


49 


963 


QLSPSQAPAGAQEVARRVTVGSASHGGRR5S1MA 

TTVSTQRGPVYIGELPQDFLRTIPTQQQRQVQLD 

AOAAWLQYGGAVGTVGRLNTTVVQAKLAKNY 

GMTRMDPYCRLRLGYAVYETPTAHNGAK^ 

xTVxrruPTVPPGVDSFYLEIFDERAFSMDDRIAWT 

HmPESLRQGKVEDKWYSLSGRQGDDKEGMINL 

VMSYALLPAAMVMPPQPWLMPTVYQQGVGY 

VPITGMFAVCSPGMVPVALPFAAVNAQPRCSEE 

DLKAlQDMFPNMDQEVmSVLEAQRGNKDAAIN 

SLLOMGEEP 


3208 


A 


54 


"1196 


LERTPASADMAWrKygLFLAGLMLVlGSlNiLS 

AKWADNFMAEGCGGSKEHSFQHPFLQAVGMFL 

GEFSCLAAFYLLRCRAAGQSDSSVDPQQPFNPLL 

FLPPALCDMTGTSLMYVALmrrSASSFQMLRGA 

VIIFTGLFSVAFLGRRLVLSQWLGILATIAGLVW 

GLADLLSKHDSQHKLSEVITGDLLnMAQnVAIQ 

■Kinn DcfiTv/wHMVTTPl R AVGTEGLFGFVILSLLL 

VPMYYIPAGSFSGNPRGTLEDALDAFCQVGQQP 

LIAVALLGNISSIAFFNFAGISVTKELSATTRMVL 

DSLRTWIWALSLALGWEAFHALQILGFLILLIGT 

ALYNGLHRPLLGRLSRGRPLAEESEQERLLGGTR 

TPTNDAS 


3209 


A 


1 104 


1999 


AKVVSLKEFSCFWRREKPVSSLSSLgVKAEASW 

DSAVHGCPQLSRGTPVDERLFLIVRVTVQLSHPA 

DMQLVLRKRICVNVHGRQGFAQSLLKKMSHRSS 

IPQCGVTFETVSNIPEDAQGVEEREALARMAANV 

ENPASADSEAYIEKYLRSVLAVENLLTLDRLRQE 

VAVKEQLTGKGKLSRRSISSPNVNRLSGSRQDLIP 

SYSLGSNKGRWESQQDVSQTTVSRGIAPAPALSV 

SPONNHSPDPGLSNLAASYLNPVKSFVPQMPKLL 

KSLFPVRDEKRGKRPSPLAHQPVPRIMVQSASPDI 

RVTRMEEAQPEMGPDVLVQTMGAPALKICDKP 

AKVPSPPPVIAVTAVTPAPEAQDGPPSPLSEASSG 

YFSHSVSTATLSDALGPGLDAAAPPGSMPTAPEA 

EPEAPISHPPPPTAVPAEEPPGPQQLVSPGRERPDL 

EAPAPGSPFRVRRVRASELRSFSRMLAGDPGCSP 

GAEGNAPAPGAGGQALASDSEEADEVPEWLKEG 

EFVTVGAHKTGVVRYVGPADFQEGTWVGVELD 

LPSGKNDGSIGGKQYFRCNPGYGLLVRPSRVRR 

A TYiPVRRRSTGLRLGAPEARRS ATLSGSATNLAS 

1 T A AT . AKADRSHKNPENRKSWAS 

" SPFWTEKRRMEKPLFPLVPLHWFGFGY lALVvS 


3210 


TA 


T324 


"^694 


GGIVGYVKTGSVPSLAAQLLFGSLAGLGAYQLY 
QDPRNVWGFLAATSVTFVGVMGMRSYYYGKF 
vrovrji lAfiAST LMAAKVGVRMLMTSD 


3211 


A 


1078 


594 


"VGMELPAVNLKVlLLGHWLLTlWCKJlVi-SCrtiyA 
WA>1FTILALGVWAVAQRDSIDAISMFLGQLLATI 
FLDIVfflSIFYPRVSLTDTGRFGVGMAILSLLLKPL 
SCCFVYHMYRERGGELLVHTGFLGSSQDRSAYQ 
TinSAEAPADPFAVPEGRSODARGY 


3212 


A 


1 


1962


FRCGLAPKGRPRRRADPVASAIMDPAEAVLQEK
ALKPMMEFRSWCPGWNIMARSRLTATSTSRVQ 
CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT

AFQNSSEREDCNNGEPPRKIIPEKNSLRQTYNSCA

RLCLNQETVCLASTAMKTENCVAKTKLANGTSS 

MTVPKQRKLSASYEKEKELCVKYFEQWSESDQV 

EFVEHLISQMCHYQHGHINSYLKPMLQRDFITAL 

PARGLDHIAENILSYLDAKSLCAAELVCKEWYR

VTSDGMLWKKLIERMVRTDSLWRGLAERRGWG

QYLFKNKPPDGNAPPNSFYRALYPKIIQDEETIES

NWRCGRHSLQRIHCRSETSKGVYCLQYDDQKIV 

SGLRDNTIKIWDKNTLECKRILTGHTGSVLCLQY 

DERVIITGSSDSTVRVWDVNTGEMLNTLIHHCEA 

VLHLRFNNGMMVTCSKDRSIAVWDMASPTDITL

RRVLVGHRAAVNWDFDDKYIVSASGDRTIKV 

WNTSTCEFVRTLNGHKRGIACLQYRDRLWSGS 

SDNTIRLWDIECGACLRVLEGHEELVRCIRFDNK 

RIVSGAYDGKIKVWDLVAALDPRAPAGTLCLRT 

LVEHSGRVFRLQFDEFQIVSSSHDDTILIWDFLND 

PAAQSEPPRSPSRTYTYISR 


3213 


A 


1 


1962 


FRCGLAPKGRPRRRADPVASAIMDPAEAVLQEK 

ALKFMMEFRSWCPGWNTMARSRLTATSTSRVQ 

CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT 

AFQNSSEREDCNNGEPPRKIIPEKNSLRQTYNSCA

RLCLNQETVCLASTAMKTENCVAKTKLANGTSS 

MIVPKQRKLSASYEKEKELCVKYFEQWSESDQV

EFVEHLISQMCHYQHGHINSYLKPMLQRDFITAL

PARGLDHIAENILSYLDAKSLCAAELVCKEWYR

VTSDGMLWKKLIERMVRTDSLWRGLAERRGWG

QYLFKKKPPDGNAPPNSFY^^^

NWRCGRHSLQRIHCRSETSKGVYCLQYDDQKIV

SGLRDNTIKIWDKNTLECKRILTGHTGSVLCLQY

DERVIITGSSDSTVRVWDVNTGEMLNTLIHHCEA

VLHLRFNNGMMVTCSKDRSIAVWDMASPTDITL 

RRVLVGHRAAVNWDFDDKYIVSASGDRTIKV 

WNTSTCEFVRTLNGHKRGIACLQYRDRLVVSGS 

SDNTIRLWDIECGACLRVLEGHEELVRCIRFDNK 

RIVSGAYDGKIKVWDLVAALDPRAPAGTLCLRT 

LVEHSGRVFRLQFDEFQIVSSSHDDTILIWDFLND 

PAAQSEPPRSPSRTYTYISR 


3214


A 


1 


1962 


FRCGLAPKGRPRRRADPVASAIMDPAEAVLQEK 

ALKFMMEFRSWCPGWNTMARSRLTATSTSRVQ 

CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT 

AFQNSSEREDCNNGEPPRKIIPEKNSLRQTYNSCA

RLCLNQETVCLASTAMKTENCVAKTKLANGTSS 

MIVPKQRKLSASYEKEKELCVKYFEQWSESDQV 

EFVEHLISQMCHYQHGHINSYLKPMLQRDFITAL 

PARGLDHIAENILSYLDAKSLCAAELVCKEWYR 

VTSDGMLWKKLIERMVRTDSLWRGLAERRGWG 

QYLFKNKPPDGNAPPNSFYRALYPKIIQDIETIES 

NWRCGRHSLQRIHCRSETSKGVYCLQYDDQKIV 

SGLRDNTIKIWDKNTLECKRILTGHTGSVLCLQY

DERVIITGSSDSTVRVWDVNTGEMLNTLIHHCEA 

VLHLRFNNGMMVTCSKDRSIAVWDMASPTDITL 

RRVLVGHRAAVNWDFDDKYIVSASGDRTIKV 

WNTSTCEFVRTLNGHKRGIACLQYRDRLWSGS 

SDNTIRLWDIECGACLRVLEGHEELVRCIRFDNK 

RIVSGAYDGKIKVWDLVAALDPRAPAGTLCLRT

LVEHSGRVFRLQFDEFQIVSSSHDDTILIWDFLND
PAAQSEPPRSPSRTYTYISR


3215 


A 


2 


1376 


EARLVGCQRGGPARPGSYSSGAETAOKAMAAN 

LSRNGPALQEAYVRWTEKSPTDWALFTYEGNS 

NDIRVAGTGEGGLEEMVEELNSGKVMYAFCRV 

KDPNSGLPKFVLINWTGEGVNDVRtCGACASHVS 

TMASFLKGAHVTTNARAEEDVEPBCIMEKVAKA 

SGANYSFHKESGRFQDVGPQAPVGSVYQKTNAV 

SEIKRVGKDSFWAKAEKEEENRRLEEKRRAEEA 

ORQLEQERRERELREAARREQRYQEQGGEASPQ 

RTWEQQQEWSRNRNEQESAVHPREIFKQKERA 

MSTTSISSPQFGKLRSPFLQKQLTQPETHFGREPA 

AAISRPRADLPAEEPAPSTPPCLVQAEEEAVYEEP 

PEOETFYEQPPLVQQQGAGSEHTOHfflQGQGLSG 

QGLCARALYDYQAADDTEISFDPENLITGffiVIDE 

GWWRGYGPDGHFGMFPANYVELIE 


3216 


A 


936 


204 


AMASTLEYSPSPLRRLVGPAAGFSRAARAULbW 

DPMAFFTGLWGPFTCVSRVLSHHCFSTTGSLSAI 

OKMTRVRWDNSALGNSPYHRAPRCIHVYKKN 

GVGKVGDQILLAIKGQKKKALIVGHCMPGPRMT 

PRFDSNNVVLIEDNGNPVGTRIKTPIPTSLRKREG 

EYSKVLAIAQNFV 


3217 


A 


1 


1563 


MLCALLLLPSLLGATRASPTSGPQECAKGSIVW
CQDLQTAARCGAVGYCQGAVWNKPTAKSLPCD
VCQDLAAAAGNGLNPDATESDILALVMKTCEWL 
PSQESSAGCKWMVDAHSSAILSMLRGAPDSAPA 
QVGTALSLCEPLQRHLATLRPLSKEDTFEAVAPF
MANGPLTFHPRQAPEGALCQDCVRQVSRLQEAV 
RSNLTLADLNIQEQCESLGPGLAVLCKNYLFQFF 
VPADQALRLLPPQELCRKGGFCEELGAPARLTQ 
WAMDGVPSLELGLPRKQSEMQMKAGVTCEVC 
MNVVQKLDHWLMSNSSELMITHALERVCSVMP 
ASITKECnLVDTYSPSLVQLVAKlTPEKVCKFERL 
CGNRRRARAVHDAYAIVPSPEWDAENQGSFCNG 
CKRLLTVSSHNLESKSTKRDILVAFKGGCSILPLP 
YMIQCKHFVTQYEPVLIESLKDMMDPVAVCKKV 
GACHGPRTPLLGTDQCALGPSFWCRSQEAAKLC 
NAVQHCQKHVWKEMHLHAGEHA


3218 
1 3219 


A 
A 


1 

1623 


1563 

572


MLCALLLLPSLLGATRASPTSGPQECAKGSIVW
CQDLQTAARCGAVGYCQGAVWNKPTAKSLPCD 
VCQDIAAAAGNGLNPDATESDILALVMKTCEWL 
PSQESSAGCKWMVDAHSSAILSMLRGAPDSAPA
QVCTALSLCEPLQRHLATLRPLSKEDTFEAVAPF 
MANGPLTFHPRQAPEGALCQDCVRQVSRLQEAV 
RSNLTLADLNIQEQCESLGPGLAVLCKNYLFQFF 
VPADQALRLLPPQELCRKGGFCEELQAPARLTQ 
VVAMDGVPSLELGLPRKQSEMQMKAGVTCEVC 
MNVVQKLDHWLMSNSSELMITHALERVCSVMP
ASITKECnLVDTYSPSLVQLVAKITPEKVCKFIRL
CGNRRRARAVHDAYAIVPSPEWDAENQGSFCNG
CKRLLTVSSHNLESKSTKRDILVAFKGGCSILPLP
YMIQCKHFVTQYEPVLIESLKDMMDPVAVCKKV
GACHGPRTPLLGTDQCALGPSFWCRSQEAAKLC
NAVQHCQKHVWKEMHLHAGEHA
TSAEQWKGCTCTFKDRSKLREHLRSHlX^lilf^v v A

CPTCGGMFANNTKFLDHIRRQTSLDQQHFQCSH 

CSKRFATERLLRDHMRNHVNHYKCPLCDMTCPL 

PSSLRNHMRFRHSEDRPFKCDCCDYSCKNLIDLQ 

KHLDTHSEEPAYRCDFENCTFSARSLCSIKSHYR 

KVHEGDSEPRYKCHVCDKCFTRGNNLTVHLRK 

KHQFKWPSGHPRFRYKEHEDGYMRLQLVRYES 

VELTQQLLRQPQEGSGLGTSLNESSLQGIILETVP 

GEPGRKEEEEEGKGSEGTALSASQDNPSSVIHW 

NQTNAQGQQEIVYYVLSEAPGEPPPVPEPPSGGI 

MEKLQGIAEEPEIQMV 


3220 


A 


2760 


745 


SLGIPSGNTRGTGLVLDGDTSYTYHLVCMGPEAS 
GWGQDEPQTWPTDHRAQQGVQRQGVSYSVHA 
YTGQPSPRGLHSENREDEGWQVYRLGARDAHQ 
GRPTWALRPEDGEDKEMKTYRLDAGDADPRRL 
CDLERERWAVIQGQAVRKSSTVATLQGTPDHGD 
PRTPGPPRSTPLEENWDREQIDFLAARQQFLSLE 
QANKGAPHSSPARGTPAGTTPGASQAPKAFNKP 
HLANGHWPIKPQVKGVVREENKVRAVPTWAS 
VQWDDPGSLASVESPGTPKETPIEREIRLAQERE 
ADLREQRGLRQATDHQELVEIPTRPLLTKLSLITA 
PRRERGRPSLYVQRDIVQETQREEDHRREGLHV 
GRASfPDWVSEGPQPGLRRALSSDSILSPAPDAR 
AADPAPEVRKVNRIPPDAYQPYLSPGTPQLEFSA 
FGAFGKPSSLSTAEAKAATSPKATMSPRHLSESS 
GlCPLSTKOEASKPPRGCPOANRGVAniWEYFRLR 
.PLRFJIAPDEPQQAQVPHVWGWEVAGAPALRLQ 
kSQSSDLLERERESVLRREQEVAEERRNALFPEV 
FSPlPDiBNSDQNSRSSSQASGITGSYSVSESPFFSPI 
HLHSNVAWTVEDPVDSAPPGQRKKEQWYAGIN 
PSDGINSEVLEAIRVTRHKNAMAERWESRIYASE 
EDD 


3221 


A 


15 


478 


SRVFFFFFFFPAFKMSKRGRGGSSGAKFRISLGLP 
VGAVDSrCADNTGAKNLYnSVKGIKGRLNRLPAA 
GVGDMVMATVKKGKPELRKKVHPAVVIRQRKS 
YRRKDGVFLYFEDNAGVIVNNKGEMKGSAITGP 
VAKECADLWPRIASNAGSIA 


3222 


A 


207 


1321 


PLIPLHPANRSPATMAELQEVQITEEKPLLPGQTP 

EAAKIHSVETPYGSVTFTVYGTPKPKRPAILTYH 

DVGLNYKSCFQPLFQFEDMQEHQNFVRVHVDAP 

GMEEGAPVFPLGYQYPSLDQLADMIPCVLQYLN 

FSTUGVGVGAGAYILARYALNHPDTVEGLVLINI 

DPNAKGWMDWAAHKLTGLTSSIPEMILGHLFSQ 

EELSGNSELIQKYRNnTHAPNLDNIELYWNSYNN 

RRDLNFERGGDITLRCPVMLWGDQAPHEDAW 

ECNSKLDPTQTSFLBCMADSGGQPQLTQPGKLTE 

AFKYFLQGMGYMASSCMTRLSRSRTASLTSAAS 

VDGNRSRSRTLSQSSESGTLSSGPPGHTMEVSC 


3223 


A 


132 


1664 


SARRWGAAGAGPHGLHLRAHGPRPSVRTGLPSV 

GRQAAGAAMGRGWGFLFGLLGAVWLLSSGHGE 

EQPPETAAQRCFCQVSGYLDDCTCDVETIDRFNN 

YRLFPRLQKLLESDYFRYYKVNLKRPCPFWNDIS 

QCGRRDCAVKPCQSDEVPDGIKSASYKYSEEAN 

NLEEECEQAERLGAVDESLSEETQKAVLQWTKH 

DDSSDNFCEADDIQSPEAEYVDLLLNPERYTGYK 

GPDAWKIWNVIYEENCFKPQTIKRPLNPLASGQG

3224

3225

803

5054



TSEENTFYSWLEGLCVEKRAFYRLISGLHASINV 

HLSARYLLQETWLEKKWGHNITEFQQRFDGILTE 

GEGPRRLKKLYFLYLIELRALSKVLPFFERPDFQL 

FTGNKIQDEENKMLLLEILHEIKSFPLHFDENSFF 

AGDKKEAHKLKEDFRLHFRNISRIMDCVGCFKC 

RLWGKLQTQGLGTALKILFSEKLIANMPESGPSY 

EFHLTRQEIVSLFNAFGRISYKCERIRKTSRNLLQ 

NIH 



PGSTISWDRDAAGESGTRAASPSPSGSRTAGRLi' 
SPSYSPLPAPSLFPPPPLPAPAASTMSAGGDFGNP 
LRKFKLVFLGEQSVGKTSLITRFMYDSFDNTYQA 
TIGIDFLSKTMYLEDRTVRLQLWDTAGQERFRSL 
PSYIRDSTVAVWYDITNLNSFQQTSKWIDDYRT 
ERGSDVnMLVGNKTDLADKRQITIEEGEQRAKE 
LSVMFIETSAKTGYNVKQLFRRVASALPGMENV 
QEKSKEGMIDIKLDKPQEPPASEGGCSC 



PEVTKPSLSQPTAASPIGSSPSPPVNGGNMAKKVA 
VPNGQPPSAARYMPREVPPRFRCQQDHKVLLKR 
GQPPPPSCMLLGGGAGPPPCTAPGANPNNAQVT 
GALLQSESGTAPDSTLGGAAASNYANSTWGSGA 
SSIWGTSPNPIHIWDKVIVDGSDMEEWPCIASKD 
TESSSENTTDNNSASNPGSEKSTLPGSTTSNKGK 
GSQCQSASSGNECNLGVWKSDPKAKSVQSSNST 
TENNNGLGNWRNVSGQDRIGPGSGFSNFNPNSN 
PSAWPALVQEGTSRKGALETDNSNSSAQVSTVG 
QTSREQQSKMENAGVNFWSGREQAQIHNTDGP 
KNGNTNSLNLSSPNPMENKGMPFGMGLGNTSRS 
TDAPSQSTGDRKTCSVGSWGAARGPSGTDTVSG 
QSNSGNNGNNGKEREDSWKGASVQKSTGSKND 
SWDNNmSTGGSWNFGPQDSNDNKWGEGNKM 
TSGVSQGEWKQPTGSDELKIGEWSGPNQPNSST 
GAWDNQKGHPLLENQGNAQAPCWGRSSSSTGS 
EVEGQSTGSNHKAGSSDSHNSGRRSYRPTHPDC 
QAVLQTLLSRTDLDPRVLSNTGWGQTQIKQDTV 
WDDEEVPRPEGKSDKGTEGWESAATQTKNSGG 
WGDAPSQSNQMKSGWGELSASTEWKDPKNTGG 
WNDYKNNNSSNWGGGRPDEKTPSSWNENPSKD 
QGWGGGRQPNQGWSSGKNGWGEEVDQTKNSN 
WESSASKPVSGWGEGGQMEIGTWGNGGNASLA 
SKGGWEDCKRSPAWNETGRQPNSAVNKQHQQQ 
QPPQQPPPPQPEASGSWGGPPPPPPGNVRPSNSS 
WSSGPQPATPKDEEPSGWEEPSPQSISRKMDIDD 
GTSAWGDPNSYNYKNVNLWDKNSQGGPAPREP 
NLPTPMTSKSASDSKSMQDGWGESDGPVTGARH 
PSWEEEEDGGVWNTTGSQGSASSHNSASWGQG 
GKKQMKCSLKGGNNDSWMNPLAKQFSNMGLL 
SQTEDNPSSKMDLSVGSLSDKKFDVDKRAMNLG 
DFhnDIMRKDRSGFRPPNSKDMGTTDSGPYFEKG 
GSHGLFGNSTAQSRGLHTPVQPLNSSPSLRAQVP 
PQFISPQVSASMLKQFPNSGLSPGLFNVGPQLSPQ 
QIAMLSQLPQIPQFQLACQLLLQQQQQQQLLQN 

QRKISQAVRQQQEQQLARMVSALQQQQQQQQR 
QPGMKHSPSHPVGPKPHLDNMVPNALNVGLPDL 
QTKGPrPGYGSGFSSGGMDYGMVGGKEAGlESR 
FKQWTSMMEGLPSVATQEANMHKNGAIVAPGK

TRGGSPYNQFDEPGDTLGGHTGPAGDSWLPAKS 

PPTNKIGSKSSNASWPPEFQPGVPWKGIQNIDPES 

DPYVTPGSVLGGTATSPIVDTDHQLLRDNTTGSN 

SSLNTSLPSPGAWPYSASDNSFTNVHSTSAKFPD 

YKSTWSPDPIGHNPTHLSNKMWKNmSSRNTTPL 

PRPPPGLTNPKPSSPWSSTAPRSVRGWGTQDSRL 

ASASTWSDGGSVRPSYWLVLHNLTPQIDGSTLRT 

ICMQHGPLLTFHLNLTQGTALIRYSTKQEAAKAQ 

TALHMCVLGNTTILAEFATDDEVSRFLAQAQPPT 

PAATPSAPAAGWQSLETGQNQSDPVGPALNLFG 

GSTGLGQWSSSAGGSSGADLAGASLWGPPNYSS 

SLWGVPTVEDPHRMGSPAPLLPGDLLGGGSDSI 


3226 


A 


200 


1387 


VPWKRQDEQLSLQVETLYLDSPAVIHLLSPTFLP

PSSLPPFLQIVDSSSSACTLDSFFPFLAPWDSPQDC

GFKDHQPLTLQALTVELARWTLMLLLSTAMYG

AHAPLLALCHVDGRVPFRPSSAVLLTELTKLLLC

AFSLLVGWQAWPQGPPPWRQAAPFALSALLYG

ANNNLVIYLQRYMDPSTYQVLSNLKIGSTAVLY

CLCLRHRLSVRQGLALLLLMAAGACYAAGGLQ

VPGNTLPSPPPAAAASPMPLHITPLGLLLLILYCLI

SGLSSVYTELLMKRQRLPLALQNLFLYTFGVLLN

LGLHAGGGSGPGLLEGFSGWAALVVLSQALNGL

LMSAVMKHGSSITRLFWSCSLVVNAVLSAVLL

rlqltaafflatluglamrlyygsr 


3227 


A 


1 


679 


RSTRARTRRPGLRAVPLPVGGFLGKMKWVWAL

LLLAALGSGRAERDCRVSSFRVKENFDKARFSGT

WYAMAKKDPEGLFLQD

TAKGRX^LNNWDVCADMVGTFTDTEDPAKFK 

MKYWGVASFLQKGNDDHWIVDTDYDTYAVQY

SCRLLNLDGTCADSYSFVFSRDPNGLPPEAQKIV
RQRQEELCLARQYRLIVHNGYCDGRSERNLL


3228 


A 


430 


1104 


QQESPAAGAARMNCKEGTDSSCGCRGNDEKKM
LKCVVVGDGAVGKTCLLMSYANDAFPEEYVPT
VFDHYAVTVTVGGKQHLLGLYDTAGQEDYNQL
RPLSYPNTDVFLICFSVVNPASYHNVQEEWVPEL
KDCMPHVPYVLIGTQIDLRDDPKTLARLLYMKE
KPLTYEHGVKLAKAIGAQCYLECSALTQKGLKA
VFDEAILTIFHPKKKKKRCSEGHSCCSN


3229 


A 


25 


722 


AISAGRSAKMQLKPMEINPEMLNKVLSRLGVAG

QWRFVDVLGLEEESLGSVPAPACALLLLFPLTAQ

HENFRKKQIEELKGQEVSPKVYFMKQTIGNSCGT

IGLIHAVANNQDKLGFEDGSVLKQFLSETEKMSP

EDRAKCFEKNEAIQAAHDAVAQEGQCRVDDKV

NFHFILFNNVDGHLYELDGRMPFPVNHGASSEDT

LLKDAAKVCREFTEREQGEVRPSAVALCKAA 


3230 


A 


282 


1479 


GDAATTACAPPDWFLGPRKLAAGPAGGGMLPR 

RLLAAWLAGTRGGGLLALLANQCRFVTGLRVR 

RAQQIAQLYGRLYSESSRRVLLGRLWRRLHGRP 

GHASALMAALAGVFVWDEERIQEEELQRSINEM 

KRLEEMSNMFQSSGVQHHPPEPKAQTEGNEDSE 

GKEQRWEMVMDKKHFKLWRRPITGTHLYQYRV 

FGTYTDVTPRQFFNVQLDTEYRKKWDALVIKLE 

VERDVVSGSEVLHWTHFPYPMYSRDYVYVRR 

YSVDQENNMMVLVSRAVEHPSVPESPEFVRVRS 

YESQMVIRPHKSFDENGFDYLLTYSDNPQTVFPR

YCVS\Nn!^SSGMPDia.EKLHMATLKABtNMElKV 
KDYISAKPLEMSSEAKATSQSSERKNEGSCGPAR 

lEYA 


3231 


A 


2117 


590 


FVPEPPEAGASSPCAPGDPDMSFRKWRgSKJbKH 

VFGQPVKNDQCYEDIRVSRVTWDSTFCAVNPKF 

LAVIVEASGGGAFLVLPLSKTGRIDKAYPTVCGH 

TGPVLDroWCPHNDEVIASGSEDCTVMVWQIPE 

NGLTSPLTEPWVLEGHTKRVGIIAWHPTARNVL 

LSAGCDNWLIWNVGTAEELYRLDSLHPDLIYN 

VSWNHNGSLFCSACKDKSVRIIDPRRGTLVAERE 

KAHEGARPMRAIFLADGKVFTTGFSRMSERQLA 

LWDPENLEEPMALQELDSSNGALLPFYDPDTSV 

VYVCGKGDSSIRYFEnEEPPYIHFLNTFTSKEPQR 

GMGSMPKRGLEVSKCEIARFYKLHERKCEPrVM 

TVPRKSDLFQDDLYPDTAGPEAALEAEEWVSGR 

DADPILISLREAYYPSKQRDLKISRRNVLSDSRPA 

MAPGSSHLGAPASTTTAADATPSGSLARAGEAG 

BXEEVMQELRALRALVKEQGDRICRLEEQLGRM 

ENGDA 




A 


3 


718 


RLREDDRRGLPLSSPLWTEPPLSCCLPATYPAUM 

GTAGAMQLCWVILGFLLFRGHNSQPTMTQTSSS 

QGGLGGLSLTTEPVSSNPGYIPSSEANRPSHLSST

GTPGAGVPSSGRDGGTSRDTFQTVPPNSTTMSLS 

MREDATILPSPTSETVLTVAAFGVISFIVILVVWI 

ILVGWSLRFKCRKSKESEDPQKPGSSGLSESCST 

ANGEKDSITLISMKNINMNNGKQSLSAEKVL 


3233






718


RLREDDRRGLPLSSPLWTEPPLSCCLPATYPAUM

GTAGAMQLCWVILGFLLFRGHNSQPTMTQTSSS 

QGGLGGLSLTTEPVSSNPGYIPSSEANRPSHLSST 

GTPGAGVPSSGRDGGTSRDTFQTVPPNSTTMSLS 

MREDATILPSPTSETVLTVAAFGVISFIVELVWVI 

ILVGWSLRFKCRKSKESEDPQKPGSSGLSESCST

ANGEKDSITLISMKNINMNNGKQSLSAEKVL 


3234 


A 


1169 


4292 


AGDCGRLGVGGSEFPWEGSALGASPLPPICLQSK 

TWLLRAPAPAELGELEEVAAGRGDVWEPFLDSP 

GREESLQEASPRLADHGSSSGGGWEVKRSQRLR 

RGPSSPRRPYQDMEYERRGGRGDRTGRYGATDR 

SQDDGGENRSRDHDYRDMDYRSYPREYGSQEG 

KHDYDDSSEEQSAEDSYEASPGSETQRRRKRRH 

RHSPTGPPGFPRDGDYRDQDYRTEQGEEEEEEED 

EEEEEKASNIVMLRMLPQAATEDDIRGQLQSHG 

VQAREVRLMRNKSSGQSRGFAFVEFSHLQDATR 

WMEANQHSLNILGQKVSMHYSDPKPKINEDWL 

CNKCGVQNFKRREKCFKCGVPKSEAEQKLPLGT 

RLDQQTLPLGGRELSQGLLPLPQPYQAQGVLAS 

QALSQGSEPSSENANDTIILRNLNPHSTMDSELGA 

LAPYAVLSSSNVRVKDKQTQLNRGFAHQLSTIE 

AAQLLQILQALHPPLTIDGKTrNVEFAKGSKRDM 

ASNEGSRISAASVASTAIAAAQWAISQASQGGEG 

TWATSFFPPVDYSYYOODEGYGNSQGTESSLYA 

HGYLKGTKGPGITGTKGDPTGAGPEASLEPGADS 

VSMQAFSRPQPGAAPGIYQQSAEASSSQGTAANS 

OSYTIMSPAVLKSELQSPTHPSSALPPATSPTAQE 

SYSQYPVPDVSTYQYDETSGYYYDPQTGLYYDP 

NSOYYYNAOSQQYLYWDGERRTYVPALEQSAD

GHKETGAPSKEGKEKKEKHKTKTAQQIAKDME 

RWARSLNKQKENFKNSFQPISSLRDDERRESATA 

DAGYAILEKKGALAERQHTSMDLPKLASDDRPS 

PPRGLVAAYSGESDSEEEQERGGPEREEKLTDW 

QKLACLLCRRQFPSKEALIRHQQLSGLHKQNLEI 

HRRAHLSENELEALEKNDMEQMKYRDRAAERR 

EKYGIPEPPEPBCRRKYGGISTASVDFEQPTRDGLG 

SDMGSRMLQAMGWKEGSGLGRKKQGIVTPIEA 

QTRVRGSGLGARGSSYGVTSTESYKETLHKTMV 

TRFNEAQ 


3235 


A 


3 


1217 


PSFLNTGLGPTALGVLGGAGAGLMSNPSPQVPEE 

EASTSVCRPKSSMASTSRRQRRERRFRRYLSAGR 

LVRAQALLQRHPGLDVDAGQPPPLHRACARHD 

APALCLLLRLGADPAHQDRHGDTALHAAARQG 

PDAYTOFFLPLLSRCPSAMGIKNKDGETPGQILG 

WGPPWDSAEEEEEDDASKEREWRQKLQGELED 

EWQEVMGRFEGDASHETQEPESFSAWSDRLARE 

HAQKCQQQQREAEGSCRPPRAEGSSQSWRQQEE 

EQRLFRERARAKEEELRESRARRAQEALGDREP 

KPTRAGPREEHPRGAGRGSLWRFGDVPWPCPGG 

GDPEAMAAALVARGPPLEEQGALRRYLRVQQV 

RWHPDRFLQRFRSQIETWELGRVMGAVTALSQA 

LNRHAEALK 


3236 


A 




1416 


GPASGMAEPTSDFETPIGWHASPELTPTLGPLSDT 
APPRDRWMFWAMLPPPPPPLTSSLPAAGSKPSSE 
SQPPMEAQSLPGAPPPFDAQBLPGAQPPFDAQSPL 
'^DSQPQPSGQPWNFHASTSWYWRQSSDRFPRHQK 
SLOTAVKNSYYPRKYDAKFTDFSLPPSRK^ 
KRKEPVFHFFCDTCDRGFKNQEKYDKHMSEHTK 
CPELDCSFTAHEKIVQFHWRNMHAPGMKKIKLD 
TPEEIARWREEREIKNYPTLANIERKKKLKLEKEK 
RGAVLTTTQYGKMKGMSRHSQMAKIRSPGKNH 
KWKNDNSRQRAVTGSGSHLCDLKLEGPPEANA 
DPLGVLINSDSESDKEEKPQHSVIPKEVTPALCSL 
MSSYGSLSGSESEPEETPIKTEADVLAENQVLDSS 
APKSPSQDVKATVRNFSEAKSENRKKSFEKTNPK 
REKRLSQLSNVIRTKNTPSISLGNASSSGHST 


3237 


A 


3806 


2204 


FVGEQEGGCEAGAGRGAQTYPGEAGERWFGRR 

RRRGRVVSRKKMSLKSERRGIHVDQSDLLCKKG 

CGYYGNPAWQGFCSKCWREEYHKARQKQIQED 

WELAERLQREEEEAFASSQSSQGAQSLTFSKFEE 

KKThffiKTRKVTTVKKFFSASSRVGSKKJEIQEAK^ 

PSPSINRQTSIETDRVSKEFIEFLKTFHKTGQEIYK 

QTKUl.EGMHYKRDLSffiEQSECAQDFYHNVAE 

RMQTRGKVPPERVEKIMDQIEKYIMTRLYKYVF 

CPETTDDEKKDIAIQKRIRALRWVTPQMLCVPV 

NEDIPEVSDMVVKAITDIIENIDSKRVPRDKLACIT 

KCSKHIFNAIKITKNEPASADDFLPTLIYIVLKGNP 

PRLQSNIQYITRFCNPSRLMTGEDGYYFTNLCCA 

VAFBEKLDAQSLNLSQEDFDRYMSGQTSPRKQEA 

ESWSPDACLGVKQMYKNLDLLSQLNERQERIMN 

EAKKLEKDLIDWTDGIAREVQDIVEKYPLEIKPP 

NQPLAAIDSENVENDKLPPPLQPQVYAG 


3238 


A 


1373 


449 


\a.SVCPTGVFRPAPCRMAFMKKYLLPILGLFMA 
YYYYSANEEFRPEMLQGKKVIVTGASKGIGREM

^YHLAKMGAHVWTARSKETLQKWSHCLELO 

^iASAHYlAGTMEDMTFAEQFVAQAGKLMGGLD 

NILE.NHITm^Ll^FHDDIHHVRKSMEVNFLSYV 

VLTVAALPMLKQSNGSIVWSSLAGKVAYPMVA 

AYSASKFALDGFFSSIRKEYSVSRVNVSITLCVLG 

LIDTETAMKAVSGIVHMQAAPKEECALEUK.GUA 

LRQEEVYYDSSLWmiJRNPCRKILEFLYSTSYN 

SJoi^lKVALNFIlFYLYNiaLW/QPLKKK'EA 


3239 


A 


213 


422 


HWYPDKPLKGSGFHT/GEMVDPVOaAAKRSGL 
TVFD 


3240 


A 


1255 


1425 


■HESYHVNPNLCNPVAFlSUAHSiG+KWPSWLOA 
VAmnNPSTLVGRGGRITRGOELR 


3241 


A 


161 


547 


PAGIGRSTAKTPGTPGSLHMENLKSGV i^PLKEAb 
GCPGADRNLLVYSFYEKGPLTFRDVAIEFSLEEW 
QCLDTAQQDLYRKVMLENYRNLVFLAGIAVSKP 
DT TTn F'7^''^T:PWTmKRHAMVD0PPGR 


3242 


A 


50 


241 


PLPARGKSTLPAlFCSPSAPliLASMaVVl'1'NKyQl 
nWPRGVTOFGNKYIQQTKPLTLERTINL 


3243 


A 


380 


702 


FVAYLKLPFFSQVCLFASSliMFFTISRKJ>JMSWi^:> 
LLLLVFGLIWGLMLLHYTPQQPBHQSSVKLREQI 
LDLSKRYVKALAEENKNTVDVENGASMAGYGK 

TTVRYF 


3244 


A 


37 


1391 


VLMDQRMMRSMRLREBHSPOPSHTASCLCUSAf 

CILCSCCPASRNSTVSRLIFTFFLFLGVLVSIIMLSP 

GVESOLYKLPWVCEEGAGIPTVLQGHIDCGSLLG 

YRAVYRMCFATAAFFFFFTLLMLCVSSSRDPRA 

AI0NGFWFFKFLE.VGLTVGAFYIPDGSFTN1WFY 

FGVVGSFLFILIQLVLLIDFAHSWNQRWLGKAEE 

CDSRAWYAGLFFFTLLFYLLSIAAVALMFMYYT 

EPSGCHEGKVFISLNLTFCVCVSIAAVLPKVQDA 

OPNSGLLQASVITLYTMFVTWSALSSIPEQKCNP 

HLPTQLGNETWAGPEGYETQWWDAPSIVGLnF 

LLCTLnSLRSSDHRQVNSLMQTEECPPMLDATQ 

QQQQVAACEGR^QDGVTYSY^^O^ 

ASLHVMMTLTNWYKPGETRKMISTWTAVWVKl 

CASWAGLLLYL 


3245 


A 


52 


426 


■ -SSiOTDDEBLSLAKDITQMFVASHRKMKAHi^v 
LTFLLLFVITSVASENASTSRGCGLDLLPQYVSLC 
DLDAIWGIWEAAAGAGALITLLLMLILLVRLPF 
Pf'FKi^'iTvr.T HFI FLLGTLGP 


3246 


A 


3 


515 


" ■ HEVCGSGCCCHCCAGGP V AkViKALPKLRO V Mo 
RFLNVLRSWLVMVSIIAMGNTLQSFRDHTFLYEK 
LYTGKPNLVNGLQARTFGIWTLLSSVIRCLCAIDl 
HNKTLYHTILWTFLLALGHFLSELFVYGTAAPTI 
GVLAPLMVASFSILGMLVGLRYLEVEPVSRQKK 


3247 


A 


1 


932 


- ERLCFPCMQSKIYSYMSPNKCSGMRi'PHjiifcMSV 
THHEVKCQGKPLAGIYRKREEKRNAGNAVRSA 
MKSEEOKIKDARKGPLVPFPNQKSEAAEPPKTPP 
SSCDSTOAAIAKQALKKPIKGKQAPRKKAQGKT 
OONRKLTDFYPVRRSSRKSKAELQSEERKRIDELI 
KGKEEGMKIDLIDGKGRGVIATKQFSRGDFVyE 
YHGDLIBITDAKKREALYAQDPSTGCYMYYFQY 
T QK- TYCVDATRBTNRLGRUNHSKCGNCQTKLH

DIDGVPHLILIASRDIAAGEELLYDYGDRSKASIE 
AHPWLKH 


3248 


A 


3 


870 


PGSTISCSELKGTQCRATAGSRGRRPPMTCWLRG 

VTATFGRPAEWPGYLSHLCGRSAAMDLGPMRK 

SYRGDREAFEETHLTSLDPVKQFAAWFEEAVQC 

PDIGEANAMCIATCTRDGKPSARMLLLKGFGKD 

GFRFFTNFESRKGKELDSNPFASLVFYWEPLKRQ 

VRVEGPVKKLPEEEAECYFHSRPKSSQIGAWSH 

QSSVIPDREYLRKKNEELEQLYQDQEVPKPKSW 

GGYVLYPQVMEFWQGQTNRLHDRIVFRRGLPTG 

DSPLGPMTHRGEEDWLYERLAP 


3249 


A 


43 


1210 


TRVGRGESGLKMEVKPPPGRPQPDSGRRRRRRG 

EEGHDPKEPEQLRiaFIGGLSFETTDDSLREHFEK 

WGTLTDCVVMRDPQTKRSRGFGFVTYSCVEEV 

DAAMCARPHKVDGRVVEPKRAVSREDSVKPGA 

HLTVKKJFVGG^CEDTEEY^nLlaDYFEKYGKIETIE 

VMEDRQSGKKRGFAFVTFDDHDTVDKIVVQKY 

HTINGHNCEVKKALSKQEMQSAGSQRGRGGGS 

GNFMGRGGNFGGGGGNFGRGGNFGGRGGYGG 

GGGGSRGSYGGGDGGYNGFGGDGGNYGGGPG 

YSSRGGYGGGGPGYGNQGGGYGGGGGYDGYN 

EGGNFGGGNYGGGGNYNDFGNYSGQQQSNYGP 

MKGGSFGGRSSGSPYGGGYGSGGGSGGYGSKRF 


3250 


A 


32 


1175 


VAGRGDMAALRDAEIQKDVQTYYGQVLKRSAD 
LQTNGCVTTARPWKHIREALQNVHEEVALRYY 
GCGLVIPEHLENCWILDLGSGSGRDCYVLSQLVG 
EKGHVTGIDMTKGQVEVAEKYLDYHMEKYGFQ
ASNVTFIHGYIEKLGEAGIKNESHDIWSNCVINL
VPDKQQVLQEAYRVLKHGGELYFSDVYTSLELP 
EEIRTHKVLWGECLGGALYWKELAVLAQKIGFC 
PPRLVTANLITIQNKELERVIGDCRFVSATFRLFK 
HSKTGPTKRCQVIYNGGITGHEKELMFDANFTFK 
EGEIVEVDEETAAILKNSRFAQDFLIRPIGEKLPTS 
GGCSALELKDIITDPFKLAEESDSMKSRCVPDAA
GGCCGTKKSC 


3251 


A 


32 


1175 


VAGRGDMAALRDAEIQKDVQTYYGQVLKRSAD 

LQTNGCVTTARPVPKHIREALQNVHEEVALRYY 

GCGLVIPEHLENCWILDLGSGSGRDCYVLSQLVG 

EKGHVTGIDMTKGQVEVAEKYLDYHMEKYGFQ 

ASNVTFIHGYIEKLGEAGIKNESHDIWSNCVINL 

VPDKQQVLQEAYRVLKHGGELYFSDVYTSLELP 

EEIRTHKVLWGECLGGALYWKELAVLAQKIGFC 

PPRLVTANLITIQNKELERVIGDCRFVSATFRLFK 

HSKTGPTKRCQVIYNGGITGHEKELMFDANFTFK 

EGEIVEVDEETAAILKNSRFAQDFLIRPIGEKLPTS

GGCSALELKDIITDPFKLAEESDSMKSRCVPDAA

GGCCGTKKSC


3252 


A 


1 


574 


PLGSNTAPALRVMVQAWYMDDAPGDPRQPHRP 

DPGRPVGLEQLRRLGVLYWKLDADKYENDPELE 

KIRRERNYSWMDHTICKDKLPNYEEKIKMFYEE 

HLHLDDEIRYILDGSGYFDVRDKEDQWIRIFMEK 

GDMVTLPAGIYHRFTVDEKNYTKAMRLFVGEPV 

WTAYNRPADHFEARGQYVKFLAQTA 


3253 


A 


2 


984 


ARAAAHCGICRLVRWWRKRRSVMGIQTSPVLLA 
SLGVGLVTLLGLAVGSYLVRRSRRPQVTLLDPNE

3254

968

S^^sSgslvirpyipvtsdedqgyvdl^^ 

KVYLKGVHPKFPEGGKMSQYLDSLKVGDVVEF 
RGPSGLLTYTGKGHFNIQPNKKSPPEPRVAKKLG 

'^liWEaQARYPNRFKLWFTLDHPPKD 

^yivoT Ar HPNLDKLGYSQKMRFTY 

LQSAGEGVTHVL ILLESPAKPVAAV 1^^^^^ 



DSwEENETmSAFTIQEYFAKRMAALKNK 

?QWWGSDISETQVE1^GKKRNK^ 
SYLQPKAKRHTEGKPERAEAQERVAKKKSAPAE 

EOLRGPCWDQSSKASAQDAGDHVQPA 



3255 



173 



3256 



3257 



3258 



439 



377 



1454 



113 



GSAAMKVKIKCWNGVATWL W VANDENCUlCR 
S^CPDCKVPGDDCPLVWGQCSHCFHMHC 

n HAnovoOHCPMCRQEWRFKE 



TAARRRQKGTAAK RRQKGTLHHV VLPPKSCRyF 



WfflSGrmSKVSFKITLTSDPRLPYKVLSVPKTP 
FTAVLKFAAEEFKVPAATSAnTODGIGINPAQTA 

mvJVFT .KHGSEL RIIPRDRVGSC 

GCSAAAAGAGSGPWAAQHKQI-PPA LLSFFIYNPR 

FGPREGQEENKILFYHPNEVEKNEKIRNyGL^AI 

VOmTFSPSKPAKSLHTQKNRQFF^ 

^S^IIEKQSKDGKPVffiYQEEELLDKWS^^ 

VL^CYSMYKLFNGTFLKAMEDGGVKLLKER^ 

^SwjTLHLQSCDLLDIFGGISFFPLDKMTY 

lSsfinbmeesimvkytaflyndqliwsgleq 

DDSKYL-rreLFPRHIEPELAGI^SPm^ 

GNLOHYGRFLTGPLNLNDPDAKCRFPKIFVNTO 

DTYEELHLIVYKAMSAAVCFMIDASVHPTLDFC 

?Sdsw(5qltvlasdiceqfninkrmsgsekep 

SSSS^LAEKSTVHMRKTPSVSLTSVOTD 

rifflLGDlNSDFmVDEDEEIIVKAMSDYWVVG 

^SSrm^QKNANLIEVNEEVKKLCATQF 



1558 



nniffld 



' APRGCSMPHRKKKPFmKJ^VS FHLVHKaWKU 
^AADESAPQRVLLPTQKJDNEERRAEQRKYGVF 
ro^^QH^EPSGPSELIPSSTTSAHNRREEK 

S^wSgklpssvfasefeedvgllnkaapv 
sgprldfdpdivaaldddfdfddpdnlleddfil 

OANKATGEEEGMDIQKSE^ffiDDSE^^^DVDDE^ 

gds^dydsagllsdedcmsvpgkihraiadhl 
SStcysmtssvmrrneqltlhdebje 
Syeqtoddeigaldnaelegsiqvdsnrlqevl 
^Skaencvklntlepledqdlpmneldes 
S^twleeakekwdcesicstysnlyw 

LS^KPKQIWSSKTGIPLNNa.PKKGLTAKQTE 

wqnS^gsdlpkvstqprsknesked^QAi 
Skerrvekkanklafklekrrqekellnlk 

knveglkl


3259 


A 


3 


964 


QMEPGNDTQISEFLLLGFSQEPGLQPFLFGLFLSM 

YLVTVLGNLLIILATISDSHLHTPMYFFLSNLSFA 

DICVTSTTIPKMLMMQTQl^VITmCLMQMyF 

FIU^AGFENFLLSVMAYDRFVAICHPLHYMVIMN 

PHLCGLLVLASWTMSALYSLLQILMVVRLSFCT 

ALEIPHFFCELNQVIQLACSDSFLNHMVIYFTVAL 

LGGGPLTGILYSYSKnSSIHAISSAQGKYKAFSTC 

ASHLSWSLFYGAILGVYLSSAATEINSHSSATAS 

VMYTWTPMLNPFIYSLRNKBIKRALGIHLLWGT 

MKGQFFKKCP 


3260 


A 


34 

* 


2573 


IPFLKSCCCCCLFDFPPPPLDQVQEEECEVERVTE 

HGTPKPFRKFDSVAFGESQSEDEQFENDLETDPP 

NWQQLVSREVLLGLKPCEIKRQEVINELFYTERA 

HVRTLKVLDQVFYQRVSREGILSPSELRKIFSNLE 

DILQLmGLNEQMKAVRKRNETSVIDQIGEDLLT 

WFSGPGEEKLKHAAATFCSNQPFALEMIKSRQK 

KDSRFQTFVQDAESNPLCRRLQLKDIIPTQMQRL 

TKYPLLLDNIAtYTEWPTEREKVKKAADHCRQIL 

NYVNQAVKEAENKQRLEDYQRRLDTSSLKLSEY 

Pm^ELRNLDLTKRKMIHEGPLVWKVNRDKTID 

LYTLLLEDELVLLQKQDDRLVLRCHSKILASTAD 

SKHTFSPVIKLSTVLVRQVATDNKALFVISMSDN 

GAQIYELVAQTVSEKTVWQDLICRMAASVKEQS 

TKPIPLPQSTPGEGDNDEEDPSBCLKEEQHGISVTG 

LQSPDRDLGLESTLISSKPQSHSLSTSGKSEVRDL 

FYAERQFAKEQHTDGTLKEVGEDYQIAPDSHLP 

VSEERWALDALRNLGLLKQLLVQQLGLTEKSVQ 

EDWQHFPRYRfASQGPQTDSVIQNSENIKAYHSG 

EGHMPFRTGTGDIATCYSPRTSTESFAPRDSVGL 

APQDSQASMLVMDHMIMTPEMPTMEPEGGLDD 

SGEHFFDAREAHSDENPSEGDGAVNKEEKDVNL 

RISGNYLILDGYDPVQESSTDEEVASSLTLQPMT 

GBPAVESTHQQQHSPQNTHSDGAISPFTPEFLVQQ 

RWGAMEYSCFEIQSPSSCADSQSQIMEYIHKIEA 

DLEHLKKVEESYTILCQRLAGSALTDKHSDKS 


3261 


A 


1 


2100 


AVEFAEGALTMAPWPELGDAQPNPDKYLEGAA 

GQQPTAPDKSKETNKTDNTEAPVTKIELLPSYST 

ATLIDEPTEVDDPWNLPTLQDSGIKWSERDTKGK 

ILCFFQGIGRLE.LLGFLYFFVCSLDBLSSAFQLVG 

GKMAGQFFSNSSIMSNPLLGLVIGVLVTVLVQSS 

STSTSIVVSMVSSSLLTVRAAIPIIMGANIGTSITNT 

IVALMQVGDRSEFRRAFAGATVHDFFNWLSVLV 

LLPVEVATHYLEIITQLIVESFHFKNGEDAPDLLK 

VITKPFTKLIVQLDKKVISQIAMNDEKAKNKSLV 

KIWCKTFTNKTQINVTVPSTANCTSPSLCWTDGI 

QNWTMKNVTYKENIAKCQHIFVNFHLPDLAVGT 

ILLILSLLVLCGCLIMIVKILGSVLKGQVATVrKKT 

INTDFPFPFAWLTGYLAILVGAGMTFrVQSSSVFT 

SALTPLIGIGVITmRAYPLTLGSNIGTTTTAILAAL 

ASPGNALRSSLQIALCHFFFNISGILLWYPIPFTRL 

PIRMAKGLGNISAKYRWFAVFYLIIFFFLIPLTVFG 

LSLAGWRVLVGVGVPWFfflLVLCLRLLQSRCPR 

X^PKKLQNWNFLPLWMRSLKPWDAVVSKFTGC 

FQMRCCCCCRVCCRACCLLCGCPKCCRCSKCCE 

DLEEAQEGQDVPVKAPETFDNITISREAQGEVPA

1377

3264

A 1

1398

3265

3266

A 1265

3267

A 1802

862

S84

1011

SylaJ^glvenllvicvnwrgsqraglmn 
JSsSdlgivlslpvwmlevildytwlwg 

IfvHLLYFFYDVIDCFSMLHCVINPILYNFLSPHF 

SSava^Iylpkdq-ikagtcasssscstqhsi 
SSqpSaphpepslsfqahhllpntspisp 



GVGySrffVWYDASDQGLYSAPRVWWMFRAFG 

hhavslldgglrhwlrqnlplssgksqpapaef 



noigakfwevisdehgidptgsyhgdsdlqleri 
?^^^Sgnkywrailvdlepgti^sv^^^^ 

FGOffRPDNFVFGQSGAGNNWAKGHYTEGAELV 

S?SvvrLsescdclqgfqlthslgggtgsg 

S^LrKIREEYPDRIMNIFSV*^SP™ 
PYNAILSWQLVENTDETYSIDNEALTOICFRTL 

kltwtygdlnhlvsatmsgvttclrfpgqlna 
Srixavnmvpfprlhffmpgfapltsrgsqqy 
SSw^mfdsknmmaacdprhgryltv 

^IFRGRMS^KEVDEQMLNVQNKNSSYFVEW 

i^VKTAVCDIPPRGLmSA 
SH5FTAMFRRKAFLHWYTGEGMDEMEFTEAES 

vQBvnovnDATADEQGEFEEEEGEDEA 



^I^^DARVUiPtHPEEEGHWVMiPbbOAKAOTG 
MLEMLDSLLAIX3GLVLLRDSVEWEGRSLLKAL 

vSlcgeqvhilgcevseeefregfdsdinnr 

^^SZSK1EEAFPGGPLGM.RAMCK 
RTOPVPVnALDSLSWLLLRLPCTTLCQVLHAVS 

hodscpgetppslfplihlplprsvplflstle 



WAmVYFNAFWTIPVYKPMHDFLKYDFFQT 

msvigglllvvalgpggvsmdei^^w____

LCFVFGTAALSIRSMDVLSLFLEHGKLVFASGLSP 
RA 


3268 


A 


490 


679 


EDAWITNPSLSNARSTPSKPLCYTVLKEGQVVGV 
KTTKASNTREKLRPESERRMVKSFGDEVT 


3269 


A 


2 


796 


GSTHASGARPSLKRARSQRGRPLPSRALPSAHKD 

MTTNAGPLHPYWPQHLRLDNFVPNDRF 

GLFSVTGVLWTTWLLSGRAAVVPLGTWRRLSL 

CWFAVCGFIHLVIEGWFVLYYEDLLGDQAFLSQ 

LWKEYAKGDSRYILGDNFTVCMETITACLWGPL 

SLWWIAFLRQHPLRFILQLWSVGQIYGDVLYF 

LTEHRDGFQHGELGHPLYFWFYFVFMNALWLV 

LPGVLVLDAVKHLraAQSTLDAKATKAKSKKN 


3270 


A 


17 


229 


GDTGPQILMSYLDSVASKLLQMVKKLSQSFCSNF 
KYLTKYSRKQVSDEIKKSRRTVESNPIFFKKNKKI 

Q 


3271 


A 


419 


553 


IQSGLSLCFADLSETPEGRAGVPGCPHSCDGVAS 
GRPCSPSSAG 


3272 


A 


1211 


1450 


FQFIQIELLNILQSLIRNQTQSPYNTTAYPAIDSVIT 
ILPFSFSCFFIITKCFGLSIFPSVIFFLHVYFILTLVVF 
YCC 


3273 


A 


59 


1562 


QAWSLQVALSPFFFPASPSNSFAAAVPQLLFPELP 
LPHVPGQESAKRRSARRFLMSELTKELMELVW 
GTKSSPGLSDTEFCRWTQGFVFSESEGSALEQFEG 
GPCAVIAPVQAFLLKKLLFSSEKSSWRDCSQEEQ 
KELLCHTLCDILESACCDHSGSYCLVSWLRGKTT 
EETASISGSPAESSCQVEHSSALAVEELGFERFHA 
• MQKRSFRSLPELKD AVEDQYSMWi3N^ 
LYSVLLTKGIENIK^^EffiDASEPLIDPVYGHGSQS 
LINLLLTGHAVSNVWDGDRECSGNIKLLGIHEQA 
AVGFLTLMEALRYCKVGSYLKISKIPYLDCLASE 
THLTVFFAKDMALVAPEAPSEQARRVFQTYDPE 
DNGFIPDSLLEDVMKJ^LDLVSDPEYINLMKNKL 
DPEGLGIILLGPFLQEFFPDQGSSGPESFTVYHYN 
GLKQSNYNEKVMYVEGTAWMGFEDPMLQTD 
DTPIKRCLQTKWPYIELLWTTDRSPSLN 


3274 


A 


186 


1358 


RVVHRFFKSSAFWPAEVKQPRGGPKTGSRKEGA 

GSRAPQPWRSFCGSVGAEGRMEKLRLLGLRYQ 

EYVTRHPAATAQLETAVRGFSYLLAGRFADSHE 

LSELVYSASNLLVLLNDGILRKELRKKLPVSLSQ 

QKLLTWLSVLECVEVFMEMGAAKVWGEVGRW 

LVIALIQLAKAVLRMLLLLWFKAGLQTSPPIVPL 

DRETQAQPPDGDHSPGNHEQSYVGKRSNRWRT 

LQNTPSLHSRHWGAPQQREGRQQQHHEELSATP 

TPLGLQETIAEFLYIARPLLHLLSLGLWGQRSWK 

PWLLAGWDVTSLSLLSDRKGLTRRERRELRRR 

TELLLYYLLRSPFYDRFSEARE-FLLQLLADHVPG 

VGLVTRPLMDYLPTWQKIYFYSWG 


3275 


A 


575 


759 


SVYSASSCKCCNYRKTEQIPDCEQPPASSMPERPS 
HESQPTPQMMPLSAPSRAEELGQRPG 


3276 


A 


7


258 


KAAGHRLLLAAGHPSMPSSDCLLWEGSLELRPL 
QHISSLLVLVSTTCLFAFPRVPIAFESKSCLIYHCH 
CAFTVRHYMCSSHTG 


3277 


A 


9 


2221 


KLGVEPEEEGGGDDEEDAEAWAMELADVGAAA 

SSQGVHDQVLPTPNASSRVIVHVDLDCFYAQVE 

MISNPELBCDKPLGVQQKYLWTCNYEARKLGVK

876

3279

82

2929



KLMNVRDAKJdKCPQLVLVNGbUL IRy KtMSYK 

VTELLEEFSPWERLGFDENFVDLTEMVEKRLQQ 

LQSDELSAVTVSGHVYNNQSINLLDVLHIRLLVG 

SOIAAEMREA^dYNQLGLTGCAGVASNKLLAKL 

VSGVFKPNQQTVLLPESCQHLfflSLNHKEIPQIG 

YKTAKCLEALGINSVRDLQTFSPKILEKELGISVA 

ORIQKLSFGEDNSPVILSGPPQSFSEEDSFKKCSSE 

Veaknoeellasllnrlcqderkphtvrliirry 

SSEKHYGRESRQCPIPSHVIQKLGTGNYDVMIPM 

VDILMKLFRmiVNVKMPFHLTLLSVCFCNLKAL 

NTAKKGLIDYYLMPSLSTTSRSGKHSFKMKDTH 

MEDFPKDKETNRDFLPSGRIESTRTRESPLDTTNF 

SKEKDINEFPLCSLPEGVDQEVFKQLPVDIQEEIL 

SGKSREKFQGKGSVSCPLHASRGVLSFFSKKQM 

ODIPINPRDHLSSSKQVSSVSPCEPGTSGFNSSSSS 

YMSSQKDYSYYLDNRLKDBRISQGPKEPQGFHF 

TNSNPAVSAFHSFPNLQSEQLFSRNHTTDSHKQT 

VATDSHEGLTENREPDSVDEKITFPSDIDPQVFYE 

T PF A VOKE LLAEWKRTGSDFHIGHK 

GLRLHVDLVEKPRTGIMAAETRNVAQAbAl'fPQ 

kryyrqrahsnpmadhtlrypvkpeemdwsel 
ypeffapltqnqshddpkdkkekraqaqvefad 

IGCGYGGLLVELSPLFPDTLILGLBIRVKVSDYVQ 

driralraapaggfqniaclrsnamkhlpnffy 

KGOLTKMFFLFPDPHFKRTKHKWRnSPTLLAEY 

ayvlrvgglvytitdvlelhdwmcthfeehplf 
ervpledlsedpvvghlgtsteegkkvlrnggk 

TJTOAtFftlO ODPVLOAVTSQTSLPGH 

TRTKRRLGREKAMASPPRGWGCGliULl.PFSE.LQ 



TLCEPGSGQIRYSMPEELDKGSFVGNIAKDLGLE 
POELAERGVRIVSRGRTQLFALNPRSGSLVTAGRI 
DREELCAQSPLCWNFNILVENKMKIYGVEVEn 

dindnfprfrdeelkvkvnenaaagtrlvlpfa 

RDADVGVNSLRSYQLSSNLHFSLDWSGTDGQK 

ypelvleqpldreketvhdllltaldggdpvlsg 

TTHIRVTVLDANDNAPLFTPSEYSVSVPENIPVGT 

RLLMLTATDPDEGINGKLTYSFKNEEEKISETFQL

DSNLGEISTLQSLDYEESRFYLMEWAQDGGAL

VASAKVWTVQDVNDNAPEVILTSLTSSISEDCL

PGTVL\LFSVHDGDSGENGEIACSIPRNLPFKLEK

SVDNYYHLLTTRDLDREETSDYNITLTVMDHGT

PPLSTESfflPLKVADVNDNPPNFPQASYSTSVTEN 

NPRGVSIFSVTAHDPDSGDNARVTYSLAEDTFQG

APLSSYVSINSDTGVLYALRSFDYEQLRDLQLWV

TASDSGNPPLSSNVSLSLFVLDQNDNTPEILYPAL

PTDGSTGVELAPRSAEPGYLVTKVVAVDKDSGQ

NAWLSYRLLKASEPGLFAVGLHTGEVRTARALL

DRDALKQSLVVAVEDHGQPPLSATFTVTVAVAD

RIPDILADLGSIKTPIDPEDLDLTLYLWAVAAVS

CVFLAFVIVLLVLRLRRWHKSRLLQAEGSRLAQ

VPASHFVGVDGVRAFLQTYSHEVSLTADSRKSH

LIFPQPNYADTLLSEESCEKSEPLLMSDKVDANK

EERRVQQAPPNTDWRFSQAQRPGTSGSQNGDDT

GTWPNNQFDTEMLQAMILASASEAADGSSTLGQ

ftAfiTMGLSARYGPQFTLQHVLQGELGSDYRQN

VYIPGSNATLTNAAGKRIX3KAPAGGNGNKKKS 
GKKEKK 


3280 


A 


149 


1288 


GTSQMSSHKGSVVAQGNGAPASNREADTAELAE 

LGPLLEEKGKRVIANPPKAEEEQTCPVPQEEEEE 

VRVLTLPLQAHHAMEBOVIEEFVYKVWEGRWRVI 

PYDVIi>DWLKD>nDYLLHGHRPPMPSFRACFKSIF 

miBTGNIWTHLLGFVLFLFLGILTMLRPNMYF 

MAPLQEKVVFGMFFLGAVLCLSFSWLFHTVYCH 

SEKVSRTFSKLDYSGIALLIMGSFVPWLYYSFYCS 

PQPRLIYLSTVCVLGISAIIVAQWDRFATPKHRQT 

RAGVFLGLGLSGWPTMHFTIAEGFVKATTVGQ 

MGWFFLMAVMYITGAGLYAARIPERFFPGKFDI 

WFQSHQIFHVLWAAAFVHFYGVSNLQEFRYGL 

EGGCTDDTLL 


3281 


A 


1 


557 


RPRRRQPSFSCRVLVLEDPPCFRFTNSMNQEKLA 

KLQAQVRIGGKGTARRKKKVVHRTATADDKKL 

QSSLKKLAVNNIAGffiEVNMIKDDGWIHFNNPK 

VQASLSANTFAITGHAEAKPITEMLPGILSQLGAD 

SLTSLRKLAEQFPRQVLDSKAPKPEDIDEEDDDV 

PDLVENFDEASKNEAN 


3282 


A 


155 


1139 


HALGRRGGSQELSAAACGCFALRLRAPGSGRPA 

LAPGAAAFAGLGGAPRFPPRGSAAGRTMLLKEY 

RICMPLTVDEYKIGQLYMISKHSHEQSDRGEGVE 

WQNEPFEDPHHGNGQFTEKRVYLNSKLPSWAR 

AVVPKIFYVTEKAWNYYPYTITEYTCSFLPKFSIH 

lETKYEDNKGSNDTIFDNEAKDVEREVCFIDIACD 

EmERYYKESEDPKHFKSEKTGRGQLREGWRDSH ' 

QPlMCSYKLVTVBCFEVWGLQtRVEQFVHKVVR 

DILLIGHRQAFAWVDEWYDMTMDDVREYEKN 

MHEQTNIKVCNQHSSPVDDIESHAQTST 


3283 


A 


159 


547 


DCSKLNQQVEVQESEWRLTEAKGPTMGKESGW 
DSGRAAVAAVVGGWAVGTVLVALSAMGFTSV 
GIAASSIAAKMMSTAAIANGGGVAAGSLVAILQS 
VGAAGLSVTSKVIGGFAGTALGAWLGSPPSS 


3284 


A 


227 


637 


TSNSLLRPDRMSVMDLANTCSSFQSDLDFCSDCG 

SVLPLPGAQDTVTCIRCGFNINVRDFEGKVVKTS 

WFHQLGTAMPMSVEEGPECQGPVVDRRCPRCG 

HEGMAYHTRQMRSADEGQTVFYTCTNCKFQEK 

EDS 


3285 


A 


123 


1535 



HRLSYDEAFAMANDPLEGFHEVNLASPTSPDLL 

GVYESGTQEQTTSPSVIYRPHPSALSSVPIQANAL 

DVSELPTQPVYSSPRRLNCAEISSISFHVTDPAPCS 

TSGVTAGLTKLTTOKDNYNAEREFLQGATITEAC 

DGSDDIFGLSTDSLSRLRSPSVLEVREKGYERLKE 

ELAKAQRELKLKDEECERLSKVRDQLGQELEEL 

TASLFEEAHKMVREANIKQATAEKQLKEAQGKI 

DVLQAEVAALKTLVLSSSPTSPTQEPLPGGKTPF 

KKGHTRNKSTSSAMSGSHQDLSVIQPIVKDCKEA 

DLSLYNEFRLWKDEPTMDRTCPFLDKIYQEDIFP 

CLTFSKSELASAVLEAVENNTLSIEPVGLQPIRFV 

KASAVECGGPKKCALTGQSKSCKHRIKLGDSSN 

YYYISPFCRYRITSVCNFFTYIRYIQQGLVKQQDV 

DQMFWEVMQLRKEMSLAKLGYFKEEL 


3286 


A 


3 


589 


GPSQSMAAGELEGGKPLSGLLNALAQDTFHGYP 
GITEELLRSQLYPEVPPEEFRPFLAKMRGILKSIAS

ADMDFNQLEAFLTAQTKK-QCiGITSDQAAVlbKf 
WKSHKTKIRESLMNQSRWNSGLRGLSWRVDGK 
SQSRHSAQIHTPVAIIELELGKYGQESEFLCLlitU 
F.VKVN01LKTLSEVEESISTL1S0PN 


3287 


A 


50 


390 


LGAMAKHHPDLIFCRKQAGVAIGRLCEKCDCiKU 
VICDSYVRPCILVRICDECNYGSYQGRCVICGGP 
GVSDAYYCKECTIQEKDRDGCPKIVNLGSSKTDL 

FYERKKYGFKKR 


3288 


A 


3 


428 


RTTFFRFRPCESLCGUMKLLTHNLLSSHVRbVOS 
RGFPLRLQATEVRICPVEFNPNFVARMIPKVEWS 
AFLEAADNLRLIQVPKGPVEGYEENEEFLRTMH 
HLLLEVEVIEGTLQCPESGRMFPISRGIPNMLLSE 

EETES 


3289 


A 


1 


1743 


AGCCRDTRFPTPRQPGSLCHNFCRSAACl V iKfl 

HGSPREDTGTPRSREMMFQDSVAFEDVAVSFTQ 

EEWALLDPSQKNLYRDVMQETFKNLTSVGKTW 

KVONIEDEYKNPRRNLSLMREKLCESKESHHCG 

ES^QIADDMLNRKTLPGITPCESSVCGEVGTGH 

SSLNTHERADTOHKSSEYQEYGENPYRNKECKK 

AFSYLDSFQSHDKACTOEKPYDGKECTETFISHS 

CIQRHRVMHSGDGPYKCKFCGKAFYFLNLCLIH 

ERIHTGVKPYKCKQCGKAFTRSTILPVHERTHTG 

VNADECKECGNAFSFPSEIRRHKRSHTGEKPYEC 

KOCGKVFISFSSIQYHKMTHTGEKPYECKQCGK 

AFRCGSHLQKHGRTHTGEimECRC^G^ 

SDLQRHEKTHTEDKPYGCKQCGKGFRCASQLQI 

HERTHSGEKPHECKEGGKVFKYFSSLRIHERTHT 

GEKPHECKQCGKAniYFSSLHIHERlHrGDKPYE 

CKVCGKAFTCSSSIRYHERTHTGEKPYECKHCGK 

AFISNYIRYHERTHTGEKPYQCKQCGKAFIRASS 

CREHERTHTINR 


3290 


A 


2 


1350 


GRPRSSSD>5KNFLRERAGLSSAAVQTRiUNSAAb 

RRSPAARPPVPAPPALPRGRPGTEGSTSLSAPAVL 

WAVAVWVVVSAVAWAMANYIHVPPGSPEVP 

KLNVTVQDQEEHRCREGALSLLQHLRPHWDPQE 

VTLOLFTDGITNKLIGCYVGNTMEDWLVRIYGN 

KTEIXVDRDEEVKSFRVLQAHGCAPQLYCTFNN 

GLCYEFIQGEALDPKHVCNPAIFRLIARQLAKIHA 

IHAHNGWKSNLWLKMGKYFSLIPTGFADEDIN 

KRFLSDIPSSQn^EEMTWMKEILSNLGSPVVLCH 

NDLLCKNIIYNEKQGDVQFIDYEYSGYNYLAYDI 

GNHFNEFAGVSDVDYSLYPDRELQSQWLRAYLE 

AYKEFKGFGTEVTEKEVELLFIQVNQFALASHFF 

WGLWALIQAKYSTffiFDFLGYAIVRFNQYFKMK 

PEVTALKVPE 


3291 


A 


102 


839 


- PEAQTSAVLAREKGHLPTMKHEAPMQMA5AWU 

RYADDSFTSAFVSTVGroFKVKTVFKNEKRIKLQl 

WDTAGQERYRTITTAYYRGAMGFILMYDITNEE 

SFNAVQDWSTQIKTYSWDNAQVE.VGNKCDME 

DERVISTERGQHLGEQLGFEFFETSAKDNINVKQ 

TFERLVDnCDKMSESLETDPAITAAKQNTRLKET 

PPPPOPNCAC 


3292 


A 


2 


4136 


— DRPPWNSRVDDFVTNLlHLSSKGHlSFAI<JjrSLQ 
nuTPAEMSPVLHFYVRPSGHEGAASGHTRRKLQ

GKLPELQGVETELCYNVNWTAEALPSAEETKKL 
MWLFGCPLLLDDVARESWLLPGSNDLLLEVGPR 
LNFSTPTSThm^SVCRATGLGPVDRVETTRRYRLS 
FAHPPSAEVEAIALATLHDRMTEQHFPHPIQSFSP 
ESMPEPLNGPINILGEGRLALEKANQELGLALDS 
WDLDFYTKRFQELQRNPSTVEAFDLAQSNSEHS 
RHWFFKGQLHVDGQKLVHSLFESIMSTQESSNP 
NNVLKFCDNSSAIQGKEVRFLRPEDPTRPSRFQQ 
QQGLRHVVFTAETHNFPTGVCPFSGATTGTGGRI 
RDVQCTGRGAHVVAGTAGYCFGNLHIPGYNLP 
WEDLSFQYPGNFARPLEVAffiASNGASDYGNKF 
GEPVLAGFARSLGLQLPDGQRREWIKPIMFSGGI 
GSMEADfflSKEAPEPGMEWKVGGPVYRIGVGG 
GAASSVQVQGDNTSDLDFGAVQRGDPEMEQKM 
NRVIRACVEAPKGNPICSLHDQGAGGNGNVLKE 
LSDPAGAIIYTSRFQLGDPTLNALEIWGAEYQESN 
ALLLRSPNRDFLTHVSARERCPACFVGTTTGDRRI 
VLVDDRECPVRRNGQGDAPPTPPPTPVDLELEW 
VLGKMPRKEFFLQRKPPMLQPLALPPGLSVHQA 
LERVLRLPAVASKRYLTNKVDRSVGGLVAQQQC 
VGPLQTPLADVAVVALSHEELIGAATALGEQPV 
KSLLDPKVAARLAVAEALTNLVFALVTDLRDVK 
CSGNWMWAAKLPGEGAALADACEAMVAVMA 
ALGVAVDGGKDSLSMAARVGTETVRAPGSLVIS 
AYAVCPDITATVTPDLKHPEGRGHLLYVALSPG 
QHRLGGTALAQCFSQLGEHPPDLDLPENLVRAFS 
ITQGLLKDRLLCSGHDVSDGGLVTCLLEMAFAG
NCGLQVDVPVPRVDVLSVLFAEEPGLVLEVQEP 
DLAQVLKRYRDAGLHCLELGHTGEAGPHAMVR 
VSVNGAVVLEEPVGELRALWEETSFQLDRLQAE 
PRCVAEEERGLRERMGPSYCLPPTFPKASVPREP 
GGPSPRVAILREEGSNGDREMADAFHLAGFEVW 
DVTMQDLCSGAIGLDTFRGVAFVGGFSYADVLG 
SAKGWAAAVTFHPRAGAELRRFRKRPDTFSLGV 
CNGCQLLALLGWVGGDPNEDAAEMGPDSQPAR 
PGLLLRHNLSGRYESRWASVRVGPGPALMLRG 
MEGAVLPVWSAHGEGYVAFSSPELQAQIEARGL 
APLHWADDDGNPTEQYPLNPNGSPGGVAGICSC 
DGEIHLAVMPHPERAVRPWQWAWRPPPFDTLTT 
SPWLQLFINARNWTLEGSC 


3293 


A 


65 


642 


GVRGFWAGTMASRAGPRAAGTDGSDFQHRERV 

AMHYQMSVTLKYEEKKLIYVHLVIWLLLVAKMS 

VGHLRLLSHDQVAMPYQWEYPYLLSILPSLLGLL 

SFPRNNISYLVLSMISMGLFSIAPLIYGSMEMFPA 

AQQLYRHGKAYRFLFGFSAVSIMYLVLVLAVQV 

HAWQLYYSKKLLDSWFTStQEKKHK 


3294 


A 


35 


1821 


SQRSCPRSPSSPAPPWARCSNPDSRTGGVPVPRA 

WSAGGPALGLMAAPVRLGRKRPLPACPNPLFVR 

WLTEWRDEATRSRHRTRFVFQKALRSLRRYPLP 

LRSGKEAKILQHFGDGLCRMLDERLQRHRTSGG 

DHAPDSPSGENSPAPQGRLAEVQDSSMPVPAQP 

KAGGSGSYWPARHSGARVILLVLYREHLNPNGH 

HFLTKEELLQRCAQKSPRVAPGSARPWPALRSLL 

HRNLVLRTHQPARYSLTPEGLELAQKLAESEGLS 

LLNVGIGPKEPPGEETAVPGAASAELASEAGVQQ

QPLELRPGEYRVLLCVDlGKiUGGGHRPELLREL^ 

QiaHVTHTVRKLHVGDFVWVAQETNPRDPAOT 

GELVLDffiVERKRLDDLCSSnDGRFREQKFRLKR 

CGLERJRVi J-» V JixiJtlvjo v rUNjLiOAjr a ij^s;^ y x x 

VIDGFFVKRTADDKESAAYLALLTRGLQRLYQGH 

TLRSRPWGTPGNPESGAMTSPNPLCSLLTFSDFN 

AGAIKNKAQSVREVFARQLMQVRGVSGEKAAA 

LVDRYSTPASLLAAYDACATPKEQETLLSmCG 

BmRNLGPALSRTLSQLYCSYGPLT 


3295 


A 


2 


1115 


EFHPHTXJVSGLLTPQLQEPDVWSPSRUyJb'VSLHL 

PGKGAPEVKEMAWWKSWIEQEGVTVKSSSHFN 

PDPDAETLYKAMKGIGTNEQAIIDVLTKRSNTQR 

QQIAKSFKAQFGKDLTETLKSELSGKFERLIVAL 

MYPPYRYEAKELHDAMKGLGTKEGVIIEILASRT 

KNQLRElMKAYfiiiU I uooJOJioi^iv^™-' J- * ^t-"^ 

LVCLLQGSRDDVSSFVDPALALQDAQDLYAAGE 

mGTDEMKFimCTRSATHLLRVFEEYEKlANK 

SIEDSDCSETHGSLEEAMLTVVKCTQNLHSYFAE 

RLYYAMKGAGTRDGTLIRNIVSRSEIDLNLIKCH 

FKKMYGKTLSSMIMEDTSGDYKNALLSLVGSDP 


3296


A 


1 


838 


GTRGGVGPGDNGGVEAGAKPGAAAIPLROUUS 

GETGPGRVAPGEVRGSPRGHVAGPEGPREVLFFF 

FLPSSKPASEVINEYSWKVDFLKGMLQAEKLTSS 

SEKALANQFLAPCjKV r 1 1 AivnK V r A i iv i v ni^ 

RARYTSEMRSELLGTDSAEPEMDVRKRTGVAGS 

QPVSEKQSAAELDLVLQRHQNLQEKLAEEMLGL 

ARSLKTira.AAQSVIKKDNQTLSHSLKMADQNL 

Eia.KtESERLEQHTQKSVNWLLWAMLIIVCFIFIS 

MILFIRIMPKLK 


3297 


A 


46 


617 


HKQPAGFLGLWLCj 1 b 1 I 1 lorrvjrrr.irvjA-»vjA-»oiAf\ 

TGIPGSPACRQPVVGLHSLHNYRMAMVSAMSW

VLYLWISACAMLLCHGSLQHTFQQHHLHRPEGG

TCEVIAAHRCCNKNRIEERSQTVKCSCLPGKVAG

TTRNRPSCVDASIVIGKWWCEMEPCLEGEECKTL

PDNSGWMCATGNKIKTTRIHPRT


3298 


A 


157 


748 


" T^T^rtT-NttPXTxx'ri A A vyPYAAT^FT PT VST FCSCFLAD 
IQPPDPK-NM 1 JUAA I isjiiSJVJLlSJC'ijJri-' v oi^r v^* j-r-r-kx^ 

PLNKSSYKYEADTVDLNWCVISDMEVIELNKCT 

SGQSFEVILKPPSFDGVPEFNASLPRRRDPSLEEIQ 

KKLEAAEERRKYQEAELLKHLAEKREHEREVIQ 

KAIEENNNFIKMAKEKLAQKMESNKENREAHLA 

AMLERLOEKDKHAEBVRKNKELKEEASR 


3299 


A 


5 


892 


■ -TQLPAPLSGVLSRLQLGSQAPLLTWVyJti 1 AU v A 
GGAPREUITPVTMWRLLARASAPLLRVPLSDSWA 
LLPASAGVKTLLPVPSFEDVSIPEKPKLRFIERAPL 
i/wA/ppBPirMl ^DTRGPSTEATEFTEGNFAILALG 
GGYLHWGHFElvmRLTINRSNdDPKNMFAIWRVP 
APFKPITRKSVGHRMGGGKGAIDHYVTPVKAGR 
LWEMGGRCEFEEVQGFLDQVAHKLPFAAKAVS 
RGTLEKMRKDQEERERNNQNPWTFERIATANML 
GIRKVLSPYDLTHKGKYWGKFYMPKRV 


3300 


A 


2 


1847 


FVAGGPRGSGSAAETMPEIRVITLGAUVUVUKS 
CILVSIAGKNVMLDCGMHMGFNDDRRFPDFSYI 
TQNGRLTDFLDCVnSHFHLDHCGALPYFSEMVG 
YDGPIYMTHPTQAICPILLEDYRKIAVDKKGEAN 
FFTSQMKDCMKKVVAVHLHOTVQVDDELEIKA

YYAGHVLGAAMFQIKVGSESWYTGDYNMTPD 

RHLGAAWIDKCRPNLLITESTYATTIRDSKRCRE 

RDFLKKVHETVERGGKVLIPVFALGRAQELCILL 

ETFWERMNLKVPIYFSTGLTEKANHYYKLFIPWT 

NQKIRKTFVQRNMFEFKHIKAFDRAFADNPGPM 

VWATPGMLHAGQSLQIFI^WAGNEKNMVIMP 

GYCVQGTVGHKILSGQRKLEMEGRQVLEVKMQ 

AKKMEFLKQKIEQELRVNCYMPANGETVTLPTS 
PSIPVGISLGLLJKREMAQGLLPEAKKPRLLHGTLI 
MKDSNFRLVSSEQALKELGLAEHQLRFTCRVHL 
HDTRKEOETALRVYSHLKSVLKDHCVOHLPDGS 
VTVESVLLQAAAPSEDPGTKVLLVSWTYQDEEL 
GSFLTSLLKKGLPQAPS 


3301 


A 


2 


349 


CIRTEPAAAPRRLGALSGAAALGFASYGAHGAQ 
FPDAYGKELFDKANKHHFLHSLALLGVPHCRKP 
LWAGLLLASGTTLFCTSFYYQALSGDPSIQTLAP 
AGGTLLLLGWLALAL 


3302 


A 


59 


1184 



LRRNCSALGGLFQTHSDMKGSYPVWEDFINKAG 
KLQSQLRTTVVAAAAFLDAFQKVADMATNTRG 
GTREIGSALTRMCMRHRSIEAKLRQFSSALIDCLI 
NPLQEQMEEWKKVANQLDKDHAKEYKKARQEI 
KKKSSDTLKLQKXAKKGRGDIQPQLDSALQDVN 

m^VT T T FFTVK'nA VRl<rAT TFFPnRFPTFT<sMT RP 

VffiEEISMLGEITHLQTISEDLKSLTMDPHKLPSSS 
. EQVILDLKGSDYS WSYQTPPSSPSTTMSRKSSVC 
SSLNSVNSSDSRSSGSHSHSPSSHYRYRSSNLAQQ 
APVRLSSVSSHDSGFISQDAFQSKSPSPMPPEAPN 
ORRKEKREPDPNGGGPTTASGPPA A AEE A ORPR ^ 
M 


3303 


A 


511 


958 


AGRGGPGKPVSWSSGPGSPGQTQRRSWVKSTRG 
HSSLLPPSQDFVAGLSVILRGTVDDRLNWAFNLY 
DLNKDGCITKEEMLDIMKSIYDMMGKYTYPAI R 
EEAPREHVESFFQKMDRNKDGWTIEEFIESCQK 
DENIMRSMQLFDNVI 


3304 


A 


40 


432 


ISEAASGAFQAR*FYQM\LEQKTDALGKQSVNRG 
FTKDKTLSSIFNIEMVKEKTAEEIKQIWQQYFAA 
KDTVYAVIPAEKFDLIWNRAOSCPTFLCALPRRE 
GYEFFVGQWTGTELHFHCTYKYSDPEGKA 


3305 


A 


2 


483 


LDACSTGPYSRSTHASADAWADAWVWVLKW 
GMTLFLLYFPQIFNKSNDGFn-rRSYGTVSQIFGS 
RSPSPNGFITTRSYGTVCPKDWEFYQARCFFLIHL 
♦VSSWNESWDFCKGKGCTLAIVDNSETLKLLHDL 
HDAEKNYIALPYRSSKYMSTCNGTF 


3306 


A 


2 


872 


TLSSACLIGDAWKELTIVAGAVSNQLLVWYPAT 

ALADNKPVAPDRRISGHVGIIFSMSYLESKGLLA 

TAJ5EDRSVRTWKGGDLR VPGGR VONIGHPFGHS 

ARVWQVKLLENYLISAGEDCVCLVWSHEGEILQ 

AFRGHQGRGIRAIAAHERQAWVITGGDDSGIRL * 

WHLVGRGYRGLG/DLGSLLQVP**ARYTQGCDS 

GWLLATAGSD*YRGPVSL*RRGQVLGAAARG*T 

FPVLLPAGGSSWSRGLRIVCYGQWGRSCQGCPH 

QHSNCCCGPDPVSWEGAQLELGPAWL 


3307 


A 


2 


927 


RTSRVEKGLRKAGAAVTMESDEWFSQALPANTS 
AQKAELIALTQAIRWGKDINVNTDSRYAFATVH

VRGAICQERRLLTSAEKAIKNKNPPSSKPNRSSS\F 

WGTTCDQVNAKQGPKPSPGHRLRRNLPGEKWEI 

DFTKVKPHQAGYKYLLVLVDTFSGWTEAFATK 

NETVNMVVKFLLNEIIPRHGLPVAIGSDNGPAFA 

LSIV*SVSKALNIQWKLHCAYRPQSSGQVERMNC 

TLKNTLTKLILETGVNWVSLLPLALLRVRCTPYW 

AGFLPFEIMYGRVLPILPKLRDAQLAKISQTNLLQ 

YLQSP 


3308 


A 


490 


1077 


NSPSLDFNDNEDIPTELSDSSDTHDEGEVQAFYE 

DLSGRQYVNEVFNFSVDKLYDLLFTNSPFQRDF 

MEQRRFSDHFHPWKKEENGNQSRVIPYTITLTNP 

LEHKTATVRETQTMYKASQESECYVmAEVLTH 

DWYHDYFYTINRYTLTRVARNKSRLRVSTELRY 

RKQPWGLVKTFIEKNFWSGLEDYFRHL 


3309 


A 


490 


1077 


NSPSLDFNDNEDIPTELSDSSDTHDEGEVQAl^ Yii 

DLSGRQYVNEVFNFSVDKLYDLLFTNSPFQRDF 

MEQRRFSDIIFHPWKKEENGNQSRVIPYTITLTNP 

I^HKTATVRETQTMYKASQESECYVIDAEVLTH 

DWYHDYFYTINRYTLTRVARNKSRLRVSTELRY 

RKQPWGLVKTFIEKNFWSGLEDYFRHL 


3310 


A 


2 


1198 


SPLCHPGLSRER/S*SEAKLRSGRYC*KRQVEAPL 
*RPGL*TMAASDTERDGLAPEKTSPDRDKKKEQS 
EVSVSPRASKHHYSRSRSRSRERKRKSDNEGRKH 
RSRSRSKEGRRHESKDKSSKKHKSEEHNDKEHSS 
DKGRERLNSSENGEDRHKRKERKSSRGRSHSRS 
RSRERRHRSRSRERKKSRSRSRERKKSRSRSRER 
KKSRSRSRERKRRIRSRSRSRSRHRHRTRSRSRTR 
SRSRDRKKRIEKPRRFSRSLSRTPSPPPFRGRNTA
MDAQEALARRLERAKKLQEQREKEMVEKQKQQ 
EL^AAAAATGGSVLNVAALLASGTQVTPQL\MA 
AQMAALQAKALAETGIAVPSYYNPAAVNPMKF 
AEOEKKRKMLWQGKKEGDKSQSAGNMGKN 


3311 


A 


111 


4 


PIQIPPRITPPRPSPHLLTPRTGSSPPPPRAPSPPHP T 
PGPAHDFPPLSAVLSGHTKT 


3312 


A


3 


426 


LESPRH*PPCWGPLIWALTVSSVPSPTPELSCILKS 
P/RPACPV/PGLWPSLLSPAPPQSSGPLLGLSPCPG 
AGQWPSPLSPAPPPSSDPLSGLSPCPGAGPRSSP\S 
ASAPCRAVPLSPRRLTWPPHLQVGILIPTGRPWK 

NL 


3313 


A 


162 


2 


QLQNLASRGCL*SQLLRRLRRENRLNPGGGGCSE 
IAP\CTPAWVTQRDFFRKKK 


3314 


A 


162 


2 


QLQNLASRGCL*SQLLRRLREENRLNPGGGGCSE 
L\P\CTPAWVTQRDFFRKKK 


3315 


A 


466 


1 


PRKRESWWGERLP/PRGFPPAAEDAPAPGWKGR 
KHASRTARAHVFHPIRQSIRSPVRGRPGDPRAAH 
TRSAGTRLQCKASRGG*GKGPAPTR*EGGPGSAP 
APLPASSGCSLFPDSSPWTPPPPAPGAAAAQP**T 
PRCPAALRAGAfflGRVGRPY 


3316 


A 


3 


2307 


NHLGTLMQNWDSSSRVPFSSGQHSTQSFPPbLMS 

KSNSMLQKJrl \AY VKr^MJJUVrrojyjLxirJsx»ooori i 

QSHGNSMTELKPSSKAHLTKLKIPSQPLDASASG 

DVSCVDEILKEMTHSWPPPLTAIHTPCKTEPSKFP 

FPTKESQQSNFGTGEQKRYNPSKTSNGHQSKSM 

LKDDLKLSSSEDSDGEQDCDKTMPRSTPGSNSEP 

SHHNSEGADNSRDDSSSHSGSESSSGSDSESESSS

SDSEANEPSQSASPEPEPPPTNKWQLDNWLNKV 

NPHKVSPASSVDSNIPSSQGYKKEGREQGTGNSY 

TDTSGPKETSSATPGR\APKPIQKGSESGRGRQKS 

PAQSDSTTQRRTVGKKQPKKAEKAAAEEPRGGL 

KIESETPVDLASSMPSSRHKAATKGSRKPNIKKES 

KSSPRPTAEKKKYKSTSKSSQKSREIIETDTSSSDS 

DESESLPPSSQTPKYPESNRTPVKPSSVEEEDSFFR 

QRMFSPMEEKELLSPLSEPDDRYPLIVKIDLNLLT 

RIPGKPYKETEPPKGEKKim^EKHTREAQKQASE 

KVSNKGKRKHKMEDDNRASESKKPKTEDKNSA 

GHKPSSNRESSKQSAAKEKDLLPSPAGPVPSKDP 

KTEHGSRKRTISQSSSLKSSSNSNKETSGSSKNSS 

STSKQKKTEGKTSSSSKEVKVKAPSSSSNCPPSAP 

TLDSSKPRRTKLVFDDRNYSADHYLQEAKKLKH 

NADALSDRFEKAVYYLDAWSFIECGNALEKNA 

QESKSPFPMYSETVDLI 


3317 


A 


496 


2 


NLLQDEKLVHSYPYDWRTQETCGYIVPARQWFI 

NXTRDIKTAAKELLKKVKFIPGSALNGNlVEMNro 

RRPYWCISRQRVWGVPIPVFHHKTKDEYLINSQT 

TEfflVKLVEQHGSDIWWTLPPEQLLPKEVLSEVG 

GPDALEYVPGQDILDIWFDSGTSWSYVLPGPD 


3318 


A 


2 


512 


AWHEGDSRSDQCHHPYNYGFDYYYGMPFTLVD 

SCWPDPSRNTELAFESQLWLCVQLVAIAILTLTF 

GKLSGWVSVPWLLIFSMILFIFLLGYAWFSSHTSP 

LYWDCLLMRGHEITEQPMKAE\RAGSIMVKEAIF 

LFRKGHSKGKLFLLFFLPFLQYHKTFPTTDGFHW 


3319 


A 


407 


1 


SSLHRSPRPASPLPVPEAPXSFLPVPAPKPSALPPFS 
LSGAPSSASTFSPHSSPSPASPTPAPSPQSPFPSRPT 
SPPSLTPTRRPPLPADRRGPHLLYQPLHAPLEAAA 
TGPE/PSAAAGRLPRPRPPWRAAYPASR 


3320 


A 


4037 


3432 


QMSEAVAEKMLQYRRDTAGWKICREGNGVSVS 
WRPSVEFPGNLYRGEGIVYGTLEEVWDCVKPAV 
GGLRVKWDENVTGFEIIQSITDILCVSRTSTPSAA 

MBCLISPRDFVDLVLVKRYEDGTISSNATHVEHPL 
CPPKPGFVRGFNHPCGCFCEPLPGEPTKTNLVTFF 
HTDLSGYLPQNVVDSFFPRSMTRFYANLQKAVK 


3321 


A 


37 


360 


SHSASGAGRPAAPAADLRPAPNGQRPGPRLGAR 
ALWLPPRGRPDEAGRLPGEHLPQVPWDPGLTRS 
PSPRGPCRGAARAGHVGETPAPWGCPPPCAWEH 
KGPGSEGTP 


3322 


A 


1 


420 


AIVEDKHSGRSYDITSDLGNVLTSTSIAKTVNG*A 

ESSDSGAESDEEDAQEDLMGAYHSDIDKKMMKI 

VADHKNLEVIVTNGYDKDGFVHDIQNDIHASSSL 

NGRSTVHVKPIDENLGQTGKSAVCIHQDINDDH 

VEDVT 


3323 


A 


8 


459 


DTLSLNCTLPETLPMTPSF*LSFL*FPGLARAKSIP 

TKTYSNEVVTLWYRPPDILLGSTDYSTQIDMW*G 

QVEVWQGPCGKGGGLVTTATQPAAFLFTVPSLP 

RGVGCIFYEMATGRPLFPGSTVEEQLHFIFRILSE 

EAWALCAVETHR 


3324 


A 


1276 


466 


PGSTHASARmY*L*IILSNATEVDNNFSKPPPFFP 
AGAPPASSSSSSSSSSPPTVSTAPPLIPPPGFPPPPG 
APPPSLIPTIESGHSSGYDSRSARAFPYGNVAFPH 
LPGSAPSWPSLVDTSKQWDYYARSSSSSSSSSSSS

SSSPRDRDRER*RTRERERERDHSPTPSVFNSDEE 
RYRYREYAERGYERHRASREKEERHRERRHREK 
EETRtlKSSRSNSRRRHESEEGDSHKRHKHKKSKR 
SKEGKEAGSEPAPEQESTEATPAE 


3325 


A 


266 


3312 


TCLFSASCSSLPSPSSSFALLSTENTQRTYRVNPD 

GSLRVTFASGMEIGLSSEPHILAGAVNPTLGKCNI 

SLPGEHNANLISVL**GEQGCA*NVFHISFS*AHN 

R^^.LSroFDHITRTGKIYDDHRKFTLRILYDQTGR 

PILWSPVSRYNEVNITYSPSGLVTFIQRGTWNEK 

MEYDQSEL*SPQL*LSnCYSAFVSFQSVMLLLHS 

QRRYIFEYDQPDCLLSVTMPSMVRHSLQTMLSV 

GYYRNIYTPPDSSTSFIQDYSRDGRLLQTLHLGTG 

RRVLYKYTKQARLSEVLYDTTQVTLTYEESSGD 

LSDSSTLIA*LLTVFVLVPAGPLIGRQIFRFSEEGL 

VNARFDYSYNNFRVTSMQAVINETPLPIDLYRYV 

DVSGRTEQFGKFSVINYDLNQVITTTVMKHTKIF 

SANGQVmVQYEILKAIAYWMTIQYDNVGRMVI 

CDIRVGVDANITRYFYEYDADGQLQTVSVNDKT 

QWRYSYDLNGNINLLSHGKSARLTPLRYDLRDRI 

TRLGEIQYKMDEDGFLRQRGNDIFEYNSNGLLQ 

KAYNKASGWTVQYYYDGLGRRVASKSSLGQHL 

QFFYADLTNPIRVTHLYNHTSSEITSLYYDLQGH 

LIAMELSSGEEYYVACDNTGTPLAVFSSRGQVIK 

EILYTPYGDIYHDTYPDFQVIIGFHGGLYDFLTKL 

VHLGQRDYDVVAGRWTTPNHHIWKQLNLLPKP 

FNLSTKLIKYGIFHFLFLILCLTDIRSWLELFGFQL 

rom.PGFPKPELENSPSI*QMSNSMLHLLCASLS* 

TILGf QCELQKQLimFISLDQLPMTPRYN^ 

GGKQPRFAAVPSVFGKGIKFAIKDGIVTADnGVA 

NEDSRRLAAILNNAHYLENLHFTIEGRDTHYFIK 

LGSLEEDLVLIGNTGGRRILENGVNVTVSQMTSV 

LNGRTRRFADIQLQHGALCFNIRYGTTVEEEKNH 

VLEIARQRAVAQAWTKEQRRLQEGEEGIRAWTE 

GEKOOLLSTGRVQGYDGYFVLSVEQ 


3326 


A 


290 


1041 


KACLHLLSSFLTSNFLFNPLLPDSLYSVEARSQRA 

NLGPCRRICRLQTLMRLAAGFQYSSHKDPSLSAK 

EKHTDYHNEARGPWPGWVG*RTADGSCGRGPD 

GAHHPGPKSSSWRASRLLPGLGGSHHLDAYVGR 

DLECGTPAPLQLEIPPQPRGHPAPIPTGQAGPRDS 

GPGASP*VETRPLTDGRR*PGVRPVGWTPAHPAG 

TLRPRGAVEPSVSACGKWAPSPTSQGCCEGRCD 

AVPKHRAWRTPLCSQ 


3327 


A 


1 


418 


CSECGKSFCKKSKFIIHQRTHTGEKPYECNQCGK 
SFCQKGTLTVHQRTHTGEKPYECNECGKNFYQK 
LHLIQHQRTHSGEKPYECSYCGKSFCQKTHLTQH 
QRTHSGERPYVCHDCGKTFSQKSALNDHQKIHT 

GVKLY 


3328 


A 


1 


270 


VTRKLPIFIVDAFTARAFRGSPAADCLLENELDED
IVfflQKIAREMbn-SETAFIRKLHPTDNFA^ 


3329 


A 


45 


419 


EELSCWQIWQQIANDLTRCQDSMINNSQCHKQG 
DFPYQVGTELSIQISEDENYIVNKADGPNNTGNP 
EFPILRTQDSWRKTFLTESQRLNRDQQISIKNKLC 
OCKKGVDPIGWISHHDGHRVHKR 


3330 


A 


64 


430 


FWRMTGLAPAAAVATTTSSSTMRFTSISNSLT^

AAIGLSFTTSTTTTATFTTNTTTTITSGFTVNQNQ 
LI^RGFENLWYTSTVSVVTTPVMTYGHLEGLIN 
EGNLELEIKRRLSSQATQ 


3331 


A 


3 


407 


TFGCSCTDCFFQKCCPAEAGVLLAYNKNQQIKIP 
PGTPIYECNSRCQCGPDCPNRIVQKGTQYSLCIFR 
TSNGRGWGVKTLVKIKRMSFVMEYVGEVITSEE 
AERRGQFYDNKGITYLFDLDYESDEFTVDAARY 


3332 


A 


25 


461 


PAADFVLQARPTRADILGIHSKYDEVRKAGACFY 

KMTGLGPGPOALYNGEPFKHEEMNIKELKMAVL 

QRMMDASVYLQREVFLGTL>roRTNAIDFLMDR 

NNWPRINTLILRTNQQYLNLLSTSVTADAEDFS 

TFFFLDSQDKSA 


3333 


A 


317 


54 


AWIIFLPPLTSCPLWAPGTKHKTILEARSGLGPIK 
AYPRLGPPTPGEPEAPAQDRTFHCEICNVKVNSK 
VQLKQfflSSRRHEIVDPV 


3334


A 


304 


410 


AGPSLPSNLRQIFQSLPPFMDILLLLLFFMIIFAI 


3335 


A 


19 


418 


VESRNSRVQPRVRLNDRTOAIDFLMDRNNWPRI 
NTLILRTNQQYLNLISTSVTADVEDFSTFFFLDSQ 
DKSAVXAKMvlYYLTQDDESnSAATLWIIADFDK 
PSGRKLLFNALKHMITSVHSRVGIIYNPFF 


3336 


A


1 


1003 


PSSYSSDELSPGEPLTSPPWAPLGAPERPEHLLNR 
VLERLAGGATRDSAASDILLDDIVLTHSLFLPTEK 
FLQELHQYFVRAGGMEGPEGLGRKQACLAMLL 

QGLREDTLRLHQLVETVELKIPEENQPPSKQVKP 
. LFRHFRRIDSCLQTRVAFRGSDEIFCRV YMPDHS 
:Y>;rnRSRLSASVQDILGSVTEKLQYSEEPAGREDS. 
LILVAVSSSGEKVLLOPTEDCVFTALGINSHLFAC 
TRDSYEALVPLPEEIQVSPGDTEIHRVEPEDVANH 
LTAFHWELFRCVHELEFVDYVFHGE 


3337 


A 


444 


43 


KILLCLANQFPDISFCPALPAWALLLHYSIDEAE 
CFEKACRILACITOPGRRLIDOSFLAFESSCMTFGD 
LVNKYCQAAHKLMVAVSEDVLQVYADWQRWL 
FGELPLCYFARVFDVFLVEGYKVLYRVALAXXF 


3338 


A 


1 


398 


FRGKVRGRSAEMPGSDTALTVDRTYSDPGRHHR 
CKSRVERHDIVINTLSLPLNIRRGGSDTNLNFDVPD 
GILDFHKVKLTADSLKQKILKVTEQIKIEQTSRDG 
NVAEYLKLVNNADKQQAGRIKQVFEKKNQK 


3339 


A 


1 


665 


AAAASNWGLIThTEWSIVGVSVLTMPFCFKOCGI 

VLGALLLVFCSWMTHQSCMFLVKSASLSKRRTY 

AGLAFHAYGKAGKMLVETSMIGLMLGTCIAFYV 

VIGDLGSNFFARLFGFQVGGTFRMFLLFAVSLCI 

VLPLSLQKNMMASIQSFSAMALLFYTNnFMFVIVL 

SSLKHGLFSGQWLRRVSYVRWEGVFRCIPIFGMS 

FACQSQVLPTYDSLDEPSV 


3340 


A 


198 


367 


LLPLQVLQEAFSRCVAVLTRSSKPSDMSVQVCG 
YISKCYSVAAQFEBCREKITEMP 


3341 




562 


277 


HSVIKRTPRKYIJVEIVLIDDFSNKEHLKEKLDEYI 
KLWNGLVKWKNERREGLIQARSIGAQKAKLGQ 
VLIYLDAHCEVAVNWYAPLVAPISKDR 


3342 


A 


385 


2 


NLTWWPLFRDVSFYIVDLIMLIIFFLDNVIMWWE 
SLLLLTAYFCYVWMKFNVQVEKWVKQMINRN 
KWKVTAPEAQAKPSAARDKDEPTLPAKPRLQR 
GGSSASLHNSLMRNSIFQNKIHTLDPHV 


3343 


A 


1 


385 


FRVDNSHEWKDVFnSSERSFKLDSLKCGTWYKV 
Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
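The symbol legend above can be read as a small lookup table. As a minimal sketch only (it is not part of the application; the function name `summarize` and the sample string are hypothetical), the following Python tallies residues and the three special markers in a sequence entry, and flags any character the legend does not define, which in a scanned copy may indicate transcription damage.

```python
# One-letter residue symbols, as listed in the table legend.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine",
    "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine",
    "S": "Serine", "T": "Threonine", "V": "Valine",
    "W": "Tryptophan", "Y": "Tyrosine", "X": "Unknown",
}

# Non-residue markers, as listed in the table legend.
MARKERS = {
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize(sequence):
    """Count residues and markers; collect characters outside the legend."""
    residues = 0
    markers = {name: 0 for name in MARKERS.values()}
    unrecognized = []
    for ch in sequence:
        if ch.isspace():
            continue  # line breaks and spaces carry no meaning here
        if ch in LEGEND:
            residues += 1
        elif ch in MARKERS:
            markers[MARKERS[ch]] += 1
        else:
            unrecognized.append(ch)  # e.g. lowercase or punctuation debris
    return {"residues": residues, "markers": markers,
            "unrecognized": unrecognized}


# Hypothetical sample string, not taken from the table.
report = summarize("MK*A/L\\Qz")
print(report)
```

Matching is case-sensitive on purpose: the legend defines uppercase symbols only, so lowercase letters are reported rather than silently accepted.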



SEQ ID
NO:



Method 



3344 



3345 



Predicted
beginning
nucleotide
location
corresponding
to first amino
acid residue of
peptide
sequence



Predicted end
nucleotide
location
corresponding
to last amino
acid residue of
peptide
sequence



351 



351 



147 



147 



KLAAKNSVGSGRISEnEAKTHGREPSFSKDQHLF 
THmSTHARL^JLQGWNNGGCPITAIVLEYRPKGT 
WAWQGLRANSSGEVFLTELREATWY 



SPACITSSLSQHIADPRAAPTEVKVRVMNSTAISL 
QWNRVYSDTVQGQLREYRVRKPAPDSPNYPAH 



SPACITSSLSQHIADPRAAPTEVKVRVMNSTAISL 
QWNRVYSDTVQGQLREYRVRKPAPDSPNYPAH 



3346 



1509 AGIRHEAPPTTSNRHRRQIDRGVTHLNISGLKMP 
RGIAIDWVAGNVYWTDSGRDVIEVAQMKGENR 
KTLISGMIDEPHAIVVDPLRGTMYWSDWGNHPK 
ffiTAAMDGTLRETLVQDNIQWPTGLAVDYHNER 
LYWADAKLSVIGSIRLNGTDPIVAADSKRGLSHP 
FSroWEDYIYGVTYINNRVFKJHKFGHSPL\n^n. 
GGLSHASDVVLYHQHKQPEVTNPCDRKKCEWL 
CLLSPSGPVCTCPNGKRLDNGTCVPVPSPTPPPD 
APRPGTCNLQCFNGGSCFLNARRQPKCRCQPRY 
TGDKCELDQCWEHCRNGGTCAASPSGMPTCRCP 
TGFTGPKCTQQVCAGYCANNSTCTVNQGNQPQ 
CRCLPGFLGDRCQYRQCSGYCENFGTCQMAAD 
GSRQCRCTAYFEGSRCEVNKCSRCLEGACWNK 
QSGDVTCNCTDGRVAPSCLTCVGHCSNGGSCTM 
NSKMMPECQCPPHMTGPRCEEHVFSQQQPGHIA 
SILIP 



3347 



974 



666 



3348 



1171 



3349 



403 



497 



3350 



712 



3351 



428 



SPEMESHPITQAGVQWHHLSSLQPLPPGFK*FSCF 
SLPE*LGYRHVPPCLANSVFSVEMG\FLHVGQAG 
LELLTSGDLPALASQSAGITGVSHRARPENGFENIF 



LSKITMPVICNEPLSFIQRLTEYM*HTYFIHRPSSL 
SDPVDRMQCVAAFAVSAVASQWERTGKPFNPLL 
GETYELVRDDLGFRLISEQVSHHPPISAFHAEGLN 
NDFIFHGSIYPKLKFWGKSVEAEPKGTITLELLEH 
NEAYTWTNPTCCVHNirVGKLWmQYGNYEIINH 
KTGDKCVLNFKPCGLFGKELHKVEGYIQDKSKK 
KLCALYGKWTECLYSVDPATFDAYKKNDKKNT 
EEKKNSKQMSTSEELDEMPVPDSESVFIIPGSVLL 
WRIAPRPPNSAQMYNFTSFAMVLNEVDKDMESV 
IPKTDCRLRPDIRAMENGEIDQASEEKKRLEEKQ 
RAARKNRSKSEEDWKTRWFHQGPNPYNGAQD 
WIYSGSYWDKISfYFNLPDIY 



NFASSSGKYLRTQKIKCLNNKFTPFPTTEKK*SQS 
VRPP*SNR1Y*ILQS*NISFS*LPN*NFASSSGKYLR 
TQKIKCLNNBJ^TPFPTTEKK 



GAPAQDCICLPFPFHSSFLESDIRKPARRKIQTTNP 
DFLLLLFMSVPWSAPPFCPPAEGSRDGRPKASV 
ARPAAVHEHHSPRDCGHLPDVIRSSLGGWQPH*P 
AQPENRLL*LLPVE*GHQHPTYSPVP*AGSPGGAS 
GWPGPGQAWRVRVPGPHPLCPPASPPSPVQQ**E 
SVAAGSGLPGCVLCAAGRRPGPLPLLCVEVGQA 
LPPGAWVSSSGQRPGLTHPLAYSHGCVPSEG 



MAAVVAATALKGRGARNARVLRGBLAGATANK 
ASHNRTRALQSHSSPEGKEEPEPLSPELEYIPRKR 
GKNPMKAVGLAWAIGFPCGILLFILTKREVDKDR 
VKQMKARQNMRLSNTGEYESQRFRASSQSAPSP 
DVGSGVQT 



3352 



841 



RTLFRGRRRREDDRISRPHPSTAESKAPTPKl^'DLL 
ASNFPPLPGSSSRMPGELVLENRMSDWKGVYK 










EKDNEELTISCPVPADEQTECTSAQQLNMSTSSP 

CAAELTALSTTQQEKDLIEDSSVQKDGLNQTTIP 

VSPPSTTKPSRASTASPCNNNINAATAVALQEPR 

KLSYAEVCQKPPKEPSSVLVQPLRELRSNVVSPT 

KNEDNGAPENSVEKPHEKPEARASKDYSGFRGN 

IffRGAAGKIREQRRQFSHRAIPQGVTRRNGKEQ 

YVPPRSPK 


3353 


A 


1054 


587 


lATPTWTAPLTATPTPAHQYGPARVPNGAPRLEP 

PPGKRECRVGQYVVDLTSFEQLALPVLRNADCS 

SGPGQRVCVIDEIGKMELFSQLFIQAVRQTLSTPG 

TIILGTIPVPKGKPLALVEEIR]^RKI)VKVFNV^ 

NRNHLLPDIVTCVQSSRK 


3354 


A 


56 


1268 


GMEPVGCCGECRGSSVDPRSTFVLSNLAEWER 

VLTFLPAKALLRVACVCRLWRECVRRVLRTHRS 

VTWISAGLAEAGHLEGHCLWWAEEIJENVIUI^ 

HTVLYMADSETFISLEECRGHKRARKRTSMETA 

LALEKLFPKQCQVLGIVTPGIWTPMGSGSNRPQ 

EIEIGESGFALLFPOIEGIKIOPFHFIKDPKNLTLER 

HQLTEVGLLDNPELRWLVFGYNCCKVGASNYL 

QQWSTFSDMNIILAGGQVDNLSSLTSEKNPLDI 

DASGVVGLSFSGHRIQSATVLLNEDVSDEKTAEA 

AMQRLKAANIPEHNTIGFMFACVGRGFQYYRAK 

GNVEADAFRKFFPSVPLFGFFGNGEIGCDRIVTG 

NFILRKCNEVKDDDLFHSYTTIMALIHLGSSK 


3355 


A 


1 


707 


GTSSGLGGDRLAAPGPSPPSFYPOGRGERAYDTY 
SRLLRERIVCVMGPIDqSVASLVIAQLLFLQSESN 
.B^IHMYIN^SFGGy 

WCVGQAASMGSLLLAAGTPGMRHSLPNSRIMIH 
QPSGGARGQATDIAIQAEEMKLKXQLYNIYAKH 
TKQSLQVIESAMERDRYMSPMEAQEFGILDKVL 
VHPPQDGEDEPTLVQKEPVEAAPAAEPVPAST 


3356 


A 


352 


338 


FNYNFCRNLHMPSFLV*PGMCGLLAKHLSFHIVG 
AFLIT/LGVAALCKFAVA*PRKKAYADFYRNYN* 
IKEFEVRKANISQSTK 


3357 


A 


1 


403 


ALGSCGGLLGTGLLKGTMSGTLWSKGIFAGYKR 
RIRIQREHTAVLKIEGWYARDETEFYLRMICANY 
YKANNNTVTPVLTPDKTOVMWRKVTQAHGISI 
MVRAQFRimPADAIGHRIRMML*PSRMYTTEPS 


3358 


A 


71 


2897 


FCSKDKCCLYLPDSINRSKSCTAKPGAHSQDRHA 

VMDSERQVKDTDDDESPKRSIRDSGYIDCWDSER 

SDSLSPPRHGRDDSFDSLDSFGSRSRQTPSPD\A^ 

RGSSDGRGSDSESDLPHRK1.PDVKKDDMSARRT 

SHGEPKSAVPFNQYLPNKSNQTAYVPAPLRKKK 

AEREEYRKSWSTATSPAGLGKKALQDYGPRT\PV 

S\DDAESTSMFDMRCEEEAAVQPHSRARQEQLQ 

LINNQLREEDDKWQDDLARWKSRKRSVSQDLIK 

KEEERKKMEKLLAGEDGTSERRKSKTYREIVQE 

KERRERELHEAYKNARSQEEAEGILQQYIERFTIS 

EAVLERLEMPKILERSHSTEPNLSSFLNDPNPMK 

YLRQQSU^PPKFTATVETTIARASVLDTSMSAGS 

GSPSKTVTPKAVPMLTPKPYSQPKNSQDVLKTFK 

VDGKVSVNGETVHREEEKERECPTVAPAHSLTK 

SQMFEGVARVHGSPLELKQDNGSIEINIKKPNSV 

PQELAATTEKTEPNSQEDKNDGGKSRKGNEELAS 

SEPQHFTTTVTRCSPTVAFVEFPSSPQLKNDVSEE 






PCT/US01/04098











KIXJKKPENEMSGKVELVLSQKVVKPKSPEPEAT 

LTFPFLDKMPEANQLHLPNLNSQVDSPSSEKSPV 

TTPFKFWAWDPBEERRRQEKWQQEQERLLQER 

YQ\KEQDK\LKEE\WEKAQKEVEEEBRRYYEEEP* 

mEDPVWFWSSSSADQLSTSSSMTEGSGTMNKI 

DLGNCQDEKQDRRWKKSFQGDDSDLLLKTRES 

DRLEEKGSLTEGALAHSGNPVSKGVHEDHQLDT 

EAGAPHCGTNPQLAQDPSQNQQTSNPTHSSEDV 

KPKTLPLDKSINHQIESPSERRKSISGKKLCSSCGL 

PLGKGAAMIIETLNLYFfflQCFRCGMCKGQLGDA 

VSGTDVRnmGLLNCl^CYMRSRSAGQPTTL 


3359 


A 


3 


368 


EVTASREGRGACAWECGSSRGPWGLLRGTFAPV 
RAATP*S*LPKGSLRHRP*/CPPPVHLPPKSSCPPR • 
AWAGRATSM*TSSYSSEYQPQTP*ALVTLPPRSY 
YLLTHLLTLTHLHHQILFEP 


3360 


A 


2 


392 


ARGIGSLGRDHSGSGGGTGMAGAWVRKAADYV 
RSKDFRDYLMSTHFWGPVANWGLPIAAITDMKV 
KSPEnSRRMTFAL*CYSLTFVRFAHYVQ\PWNWL 
MLGCHTAVDFDQLISSMPCISHGMTASASAL 


3361 


A 


4619 


532 


LLLGRANSPPYNSVVRH-PPATLLLRRAGWESF 
WSCQSRSPWPPRPEVRAPAKGPRGVAGAAGACS 
AGARLGDAAGGDPASGQAARGCGARAPRGLGR 
TARARDTAMEDAGAAGPGPEPEPEPEPEPEPAPE 
PEPEPKPGAGTSEAFSRLWTDVMGILDGSLGNID 
DLAQQYADYYNTCFSDVCERMEELRKRRVSQD 
LEVEKPDASPTSLQLRSQIEESLGFCSAVSTPEVE 
• RKhW'LHKSNSEDSSVGKGDWKKKNKYFWQNFR 
KNQKGIMRQTSKGEDVGYVASEimSDEERIQL 
MMMVKEKMITIEEALARLKEYEAQflRQSAALDP 
ADWPDGSYPTFDGSSNCNSREQSDDETEESVKF 
KRLHKLVNSTRRVRKKLIRVEEMKKP\STEGGEE 
HVFENSPVLDERSALYSGVHKKPLFFDGSPEKPP 
EDDSDSLTTSPSSSSLDTWGAGRKLVKTFSKGES 
RGLIKPPKKMGTFFSYPEEEKAQKVSRSLTEGEM 
KKGLGSLSHGRTCSFGGFDLTNRSLHVGSNNSDP 
MGKEGDFVYKEVIKSPTASRISLGKXVKSVKET 
MRKRMSKKYSSSVSEQDSGLDGMPGSPPPSQPD 
PEHLDKPKLKAGGSVESLRSSLSGQSSMSGQTVS 
TTDSSTSNRESVKSEDGDDEEPPYRGPFCGRARV 
HTDFTPSPYDTDSLKLKKGDHDIISKPPMGTWMG 
LLNNKVGTFNFIYVDVLSEDNEEKPKRPTRRRRK 
GRPPQPKSVEDLLDRINLKEHMPTFLFNGYEDLD 
TFKLLEEEDLDELNIRDPEHRADLLTAVELLQEY 
DSNSDQSGSQEKLLVDSQGLSGCSPRDS*CYESS 
ENLENGKTRKASLLSAKSSTEPSLKAFSRNQLGN 
YPTLPLMKSGDALKQGQEEGRLGGGLAP\DTSKS 
CDPPGC*LVL>AKmRKPPSFPSCRSC\ETL\EGPQ 
TVDTWPRSHSLDDLQVEPGAEQDVPTEVTEPPPQ 
IVPEVPQKTTASSTKAQPLEQDSAVDNALLLTQS 

KRF SEr vJsJU 1 i JsJsJvCOaJLAAoOJtvOJL^arx^ i 

DAQPPGAKHGLARTPLEGHRKGHEFEGTHHPLG 

TKEGVDAEQRMQPKIPSQPPPVPAKKSRERLANG 

LHPVPMGPSGALPSPDAPCLPVKRGSPASPTSPSD 

CPPALAPRPLSGQALGSPPSTRPPPWLSELPENTS 

LOEHGVKLGPALTR\KVSCARGVDLETLTENKL\ 










HAEGIRSSRREPYS*LRHGRCGI\P\EALVQRYAED 
LDQPERDVAANMDQIRVKQLRKQHRMAIPSGGL 
TEICRKPVSFGCISVSVSDWLISIGLFMYAGTLSTA 
GFSTL\SQVPSLSHTCLQEAG\ITEERHIRK\LLSAA 
RLFKLPPGPEAM 


3362 


A 


1 


4653 


FRGGVGYAHTLHLLPFAGSSVVLARARRTDRWT 

SGLVEMATLSLTVNSGDPPLGALLAVEHVKDDV 

SISVEEGKENILHVSENVIFTDVNSILRYLARVAT 

TAGLYGSNLMEHTEBDHWLEFSATKLSSCDSFTS 

TINELNHCLSLRTYLVGNSLSLADLCVWATLKG 

NAAWQEQLKQKKAPVHVKRWFGFLEAQQAFQS 

VGTKWDVSTTKARVAPEKKQDVGKFVELPGAE 

MGKVTVRFPPEASGYLHIGHAKAALLNQHYQV 

OTKGKLIMRFDDTNPEKEKEDFEKVILEDVAML 

HIKPDQFIYTSDHFETIMKYAEKLIQEGKAYVDD 

TPGEQKAEREQRffiSKHRKNPmKNLQMWEEMK 

kgsqfghscclrakidmssnngcmrdptlyrck 

iqphprtgn*y\nv\yptydfacpivdsiegvthal 

rtteyhdrdeqfywiiealgirkpyiweysrlnl 

nntvlskrkltwfvneglvdgwddprfptvrg 

vlrrgmtvbglkqfiaaqgssrsvvnmewdki 

wafmckvidpvapryvallkkevipvnvpeaqe 

emkevakhpknpevglkpvwyspkvfiegadae 

tfsegemvtfinwgnlnitkihknadgknsldak 

ll^enkdykkttkvtwlaetthalpipvicvty^ 

,hlitkpvlgkdedfkqyvnknskheelmlgdpc . 

.licdlkkgdiiqlqrrgfficdqpyepvspysckea" 

pc\nliyipdghtkemptsgsbgektkveatknets 

apfkerptpslnnncttsedslvlynrvavqgd 

vvrelkakkapkedvdaavkqllslkaeykek 

tgqeybcpgnppaeigqnissnssasileskslyde 

vaaqgevvrklkaekspkakineavbcllslica 

qykektgkeyipgqpplsqssdssptrnsepagle 

tpeakvlfdkvasqgewrklktekapkdqvdi 

avqellqlkaqyksligveykpvsatgaedkdk 

kkkekel^sekqnkpqkqndgqrkdpsknqgg 

glsssgagegqgpkkqtrlgleakk\eenladw 

ysqvitkselvfleyhdisgcyilrpwayaiweaikd 

ffdaedcklgvencyfpmfvsqsalekekthva 

dfapevawvmsgktelaepiairptsetvmypa 

yakwvqshrdlpiklnqwcnwrwefkhpqpf 

lrtreflwqeghsafAtmeeaaeevlqildlya 

qvyeellaipwkgrktekekfaggdytttieaf 

isasgraiqggtshhlgqnfski^ivfedpkipg 

ekqfayqnswglttrtigvmtmvhgdnmglvl 

pprvacvqvviipcgitnalseedkealiakcndy 

rrrllsvnirvradlrdnyspgwkfnhwelkg 

vpirlevgprdmkscqfvavrrdtgekltvaen 

eaetklqailediqvtlfi^sedlkthmvvant 

medfqkildsgkivqipfcgeidcedwikkttard 

qdlepgapsmgakslcipfkplcelqpgakcvcg 

knpakyytlfgrsy 


3363 


A 


3797 


1514 


lggaapetmpfpvttqgsqqtqppqkhygitspis 

laapketdcvltqk\li\etlkpfggflkkjeegta 

srrnfnfgkn*inl\^wirrnq»kaknlpqsvi\ 

ENV\GGKIFr/FLGSYRL/GliVHTKGADlDOVUVf 

APRHVDRSDFFnSFYDKLKLQEEVKDLRAVEEA 

FVPVKLCFDGIEIDILFARLALQTIPEDLDLRDDS 

LLKNLDIRCIRSLNGCRVTDEILHLVPNIDNFRLT 

LRAKLWAKRHNIYSNILGFLGGVSWAMLVART 

CQLYPNAIASTLVHKFFLWSKWEWPNPVLLKQP 

EECNLNLPVWDPRVNPSDRYHLMPniPAYPQQN 

STYNVSVSTRMVMVEEFKQGLAITDEILLSKAE 

WSKLFEAPNFFQKYKHYIVLLASAPTENQRLEW 

VGLVESKIRILVGSLEKNEFITLAHVNPQSFPAPK 

ENPDKEEFRTMWVIGLVFKKTENSENLSVDLTY 

DIOSFTDTVYRQAINSKMFEVDMKIAAMHVKRK 

QLHQLLPNHVLQKKKKHSTIGVKLTALNDSSLD 

LSMDSDNSMSVPSPTSATKTSPLNSSGSSQGKNS • 

PAPAVTAASVTNIQATEVSVPQVNSSESSGGTSSE 

SIPOTATQPAISPPPKPTVSRVVSSTRLVNPPPRSS 

GNAATSGNAATKIPTPIVGVKRTSSPHKEESPKK 

TKTEEDETSEDANCLALSGHDKTEAKEQLDTETS 

TTQSETIQTAASLLASQKTSSTOLSDIPALPANPIP 

VTKNSIKLRLNR 








3364 


A 


54 2 

t ' 1 


1073 


■ SARTMSYDYHQNWGRDGGPRSSGGGYOOUi'AU 
GHGGNRGSGGGGGGGGGGRG/WQGPASRAPER 
PRNRHWREKTGAEEQAVKRRGKREL/LVHMDE 
RREEQIVQLLNSVQAKNDKESEAQISWFAPEDHG 
YGTEVSTKNTPCSENKLDIQEKKLINQBKKMFRI 
RNRSYIDRDSEYLLQENEPDGTLDQKLLEDLQKK 
KNDLRYIEMQHFlREKLPSYGMQKELVNLIDNHQ . 
VTVISGETGCGKTTQVTQFILDNYIERGKGSACRI 
VCTOPRRISAISVAERVAAERAESCGSGNSTGYQI 
RLOSRLPRKQGSILYCTTGIILQWLQSDPYLSSVS 

htvIdeihernlqsdvlmtwkdllnfrsdlkvi 

LMSAILNAEKFSEYFGNCPMIHIPGFTFPWEYLL 

EDVffiKIRYVPEQKEHRCQFKRGFMQGHVNSQE 

KEEKEAIYKERWPDYVRELRRRYSASTVDVIEM 

MEDDKVDLNLIVALIRYIVLEEEDGAILVFLPGW 

DNISTLHDLLMSQVMFKSDKFLIIPLHSLMPTVN 

QTQVFKRTPPGVRKIVIATNIAETSITIDDWYVID 

GGKIKETHFDTQNNISTMSAEWVSKANAKQRKG 

RAG\RVQPGSLLFICINGS*EASLLGWnQLPEIF/R 

GTPLEELCLQIKVLRLGGI/GLFLSRLMDPPSNEA 

VLLSIRQL\RSLNALDKQEELTPLGVHLARLPVEP 

fflGKME-FGALFCCLDPVLTIAASLSFKDPFVIPLG 

KEKIADARRKELAKDTRSDHLTVVNAFEGWEEA 

RRRGFRYEKDYCWEYFLSSNTLQMLHNMKGQF 

AEHLLGAQFVSSRNPKDPESNINSDNEKIIBCAVIC 

AGLYPKVAKIRLNLGKKRKMVKVYTKTDGLVA 

VHPKSVNVEQTDFHYironLIYHLKMRTSSIYLYD 

CTEVSPYCLLFFGGDISIQKDNDQETIAVDEWIVF 

OSPARIAHLVKRAWHMDERREEQIVQLLNSVQ 

A irxTr>ii'R<!F ADTS WF APEDHG YDKKYFFKE 


3365 


A 


439 


878 


ECCNVRPLRETDLLKMKRKPRASSPVVlitiyi'KA 
NTKETRKKKSFSQPMSASTKEESQDGRRKGK*L 
KGRARKKNAPQKSMALRILEEGSRPTPSGHSDQL 
NEEL*QNELQLEQ/PEGT*LEQQSEGTQPEQQSGR 

MPnSTLSLSSE 


3366 


A 


1 


827 


FRGYWGVREAFTDASWSGGLGPGKPGMKITRQ 
KHAKKHLGFFR^INFGVREPYQILLDGTFCQAAL 

KDLYGAKLIAQKCQVRNCPHFKNAVSGSECLLS 

MVEEGOTHHYFVATQDQNLSVKVKKKPGVPLM 

FnQNTMVLDKPSPKTIAFVKAVESG\RLSQCMR^ 

KVSMSKRNRV**KTLNRGRRKKRKKISGPNPLS 

CLKKKKKAPDTQSSASEKKIOCiaailRhra^^ 

LSEKQNAEGE 


3367 


A 


40 


1467 


MLWGCRAKACWGPRLSDLVASLSPQRECISVHV 

GQAGVQIGNACWELFCLEHGIQADGTFDAQASK 

IhODDDSFTTFFSETGNGKHVPRAVMIDLEPTVVD 

EVRAGTYRQLFHPEQLITGKEDAANNYARGHYT 

VGKESIDLVLDRIRKLTDACSGLQGFLIFHSFGGG 

TGSGFTSLLMERLSLDYGKKSKLEFAIYPAPQVS 

TAVVEPYNSILTTHTTLEHSDCAFMVDNEAIYDI 

CRRNLDIERPTYTNLNRLISQIVSSITASLRFDGAL 

NVDLTEFQTNLVrYrKJLHrrLV 1 YAFUc)AbJsj\ YH 

EQLSVAEITSSCFEPNSQMVKCDPRHGKYMACC 

MLYRGDWPKDVNVAIAAIKTKRTIQFVDWCPT 

GFKVGINYQPPTVVPGGDLAKVQRAVCMLSNTT 

AIAEAWARLDHKFDLMYAKRAFVHWYVGEGM 

EEGEFS*RPGEDLA\ALE\KDYEEVGTDSFEEENE . 

GEEF 


3368 


A 


3 


2597 


SLLBETMDEDSSLREYTVSLDSDMDDASKCLQE 

YDSGTGNraEALRPGPRTVSTKAQPGRSASSSSG - - 

DKTTSFAEQKlfek£SHTO ' 

LNIPHAGAWAQIPfiETGLPQGRDTTQLLASEMV 

HLMMK\LKEKR\RAI*AQKKKMEAAFTKQRQm 

GRTAFLTWKKKGDGISPLREEAAGAEDEKVYT 

DRAKEKESQKTDGQRSKSLADDCESMENPQAKW 

LKSPTTPIDPEKQGNLASPSEETLNBGEILEYTKSI 

EKLNSSLHFLQQEMQRLSLQQEMLMQMREQQS 

WVISPPQPSPQKQIRDFKPSKQAGLSSAIAPFSSD\ 

SPRVPTHPSSTSLLNRKSASFSVKSQRTPRPNELKJ 

TPLNRTLTPPRSVDSLPRLRRFSPSQVPIQTRSFVC 

FGDDGEPQLKESKPKEEVKKEELESKGTLEQRG 

HNPEEKEDCPFESTVSEVLSLPVTETVCLTPNEDQ 

LNQPTEPPPKPVFPPTAPKNVNLIEVSLSDLKPPE 

KADVPVEKYDGESDKEQFDDDQKVCCGFFFKD 

DQKAENDMAMKRAALLEKRLRREKETQLRKQQ 

LEAEMEHKKEETORKTEEERQKKEDERARREFIR 

QEYMRRKQLKLMEDMDTVIKPRPQVVKQKKQR 

DiTQixipriTTrpQPi^'rpTK'rjppvQQi AQi xprnniMPQ 
lrJ\.oJLnlU^xlJJDorivli:^IJs.\^ 1 OJJINiZro 

VHSGKRTPRSESVEGFLSPSRCGSRNGEKDWEN 

ASTTSSVASGTEYTGPKLYKEPSAKSNKHnQNAL 

AHCCLAGKVNEGQKKKILEEMEKSDANNFLILF 

lEGLYKYNSDRKOFSHIPAKTLSASVDAITIHSHL 
WQTKRPVTPKKLLPTKA 


3369 


A 


977 


594 


RGSGLTQEPGSVGQLALACAEGAVEWLYPAGAL 
RLTLGGPDPRARPGIACLRPVRPFAGAQVFAERA 
GGALELLLAEGPGPAGGRCVRWGPRERRALFLQ 
ATPHQDISRRVAAFRFELREDGRPEIAP 


3370 


A 


345 


1383 


DLSI^CTGFKETNLGV!(TI^SKW1JILYALHIID 



3371 



345 



1383 



239 



3348 



3373 A 



587 



1584 






YSAVLFPC* AMDHLESFlAliCDRRTJiLAiiu<au^E 

TOEEISAEVSAKAEKVHELNEEIGKLLAKAEQLG 

AEGNVDESQKE.MEVEKVRAKKKEAEKTVAEK 

OEKRNQDRLRRREEREREERLSRRSGSRTODRRR 

SRSRDRRRRRSRSTSRERIOCLSRSRSRDRHRWm 

SRSRSHSRGHRRASRDRSAKYKFSBmAS^ESW 

ESGRSERGPPDWRLESSNGKMASRRSEEKEAG/G 

DLLNRMIVWKHGLLI 



DLSLECTGFKE TNLGVYFLSSKW VLW. ALHUU 
YSAVLFPC* AMDHLESFIAECDRRTELAKKRLAE 
TOEEISAEVSAKAEKVHELNEEIGKLLAKAEQLG 
AEGNVDESQKILMEVEKVRAKKKEABKTVAEK 
OEKRNQDRLRRREEREREERLSBRSGSRTRDRRR 
SRSRDRRRRRSRSTSRERRKLSRSRSRDRHRRHR 
SRSRSHSRGHFRASRDRSAKYKFSRERASREESW 
ESGRSERGPPDWRLESSNGKMASRRSEEKEAG/G 

DLLNRMIVWKHGLLI 



PMONCMCSLTL SVLPLGPQPPVPbKKPPblQHFR 
MSDDVHSLGKVTSDLAKRRKLTS\»GGLSEELGS 
ARRSGEVTLmGDPGSLEEWETVVGDDFSLYYD 
SYSVDERVDSDSKSEVEALTEQLSEEEEEEEEEEE 
EEEEEEEEEEEEEDEESGNQSDRSGSSGRRKAKK 
KWRKDSPWVKPSRKRRKREPPRAKEPRGVNGV 
GSSGPSEYMEVPLGSLELPSEGTLSPNHAGVSND 
TSSLETERGFEELPLCSCRMEAPKIDRISERAGHK 
CMATESVDGELSGCNAAILKRETMRPSSRVALM 
VLCETTIRARMVKifflCCPGCGYFCTAGmECHP 
bFRVAHRFHKACVSQLNGMVFCPHCGEDASEA 
OEVnPRGDGVTPPAGTAAPAPPPLSQDVPQRAD 
-reOPSARMRGHGEPRRPPCDPLADTOSSGPSLTL 
PNGGCLSAVGLPLGPGREALEKALVIQESERRKK 
LRFHPRQLYLSVKQGELQKVE.MLLDNLDPNFQS 
DQQSKRIPLHAAAQKGSVEICHVLLQAGAI^A 
VDKOORTPLMEAVVNNHLEVARYMVQRGGCV 
YSKEEDGSTCLHHAAKIGNLEMVSLLLSTGQVD 

VNAQDSGGWTPIIWAAEHKfflEVna^TRG/^^ 

VTLTDNEENICLHWASFTGSAAIAEVLLNARCDL 

HAVNYHGDTPLHIAARESYHDCVLLFLSRGANP 

ELRNKEGDTAWDLTPERSDVWFALQLNRKLRL 

SvoSmTEKnCRDVARGYENVPIPCVNGVDG 

EPCPEDYKYISENCETSTTvmroRNITHLQHCTCy 

DDCSSSNCLCGQLSIRCWYDKDGRLLQEFNKIEP 

PLIFECNQACSCWRNCKNRVVQSGIKVBLQLYR 

TAKMGWGVRALQUPQGTnCEYVGELISDAEAD 

VRHDDSYLFDLDNKDGEVYCIDARYYGmRFIN 

HLCDPNIIPVRVFMLHQDLRFPRIAFFSSRDIRTGE 

ELGFDYGDRFWDIKSKYFTCQCGSEKCKHSAEAI 

At pnt;BT.ABT. DPHPELLPELGSLPPVNT 

TOSlVSCSEDKTlKlWDTrNKQCVNNi-SUSvG 
FANFVDFNPSGTCIASAGSDQTVKVWDVRVNKL 
LOHYOVHSGGVNCISFHPSGNYLITASSDGTLKIL 
DLLKGRUYTLQGirrGPVFTVSFSKGGELFASGG 
ADTQVLLWRTNFDELHCKGLTKRNLKRLHFDSP 
PHLLDIYPRTPHPHEEKVETVEDFFLHLLRLIQSL 
i>*QTrBST.T.PLLWISFLLILPQQQKPWGLCQTRV 










'K'ljpvnT^^'n P*PMO>A/pnnpR'K'PK'nT^T*VT^pv 

jtviva V jL'i.o 1 Lttr v-^nv^iN v wv^v^jr i\jfwJVPK.\^rk. i v i djt v 

KVKA^SIPLAVTDALEHIMEQLNVLTQTVSILEQR 

LTLTEDKLKDCLENQQKLFSAVQQKS 


3374 


A 


398 


21 


WLYPlVL^d.SILDimSPSWYFHMMGIINWNT^^ 
LSGTLYPKVPQKYILFDSVILLLGMLRKIRQVCQ 
NVYMKGCSPITLFKIVHYWPGAVAHAYNPSTLG 
. GQVGAVQIT*GQEFETSLDYMVKPHLY 


3375 


A 


3 


1051 


VPTQQILAFPEQTNTKDWTVTPEHVLPESQSLLT 
FEEVAMYFSQEEWELLDPTQKALYNDVMQENY 
ETVISLALFVLPKPKVISCLEQGEEPWVQVSPEFK 
DSAGKSPTGLKLKNDTENHQPVSLSDLEIQASAG 

VloJMSAivVJvVrV^JS. 1 AOisJDJNilrJUJVltlKV OJ^WrlV^ 

DFPVKKRKKLSTWKQELLKLMDRHKKDCARE^ 

PFKCQECGKTFRVSS\DL\IKHQRIHTEEKPYKCQ 

QCDKRFRWSSDLNKHLTTHQGIKPYKCSWGGKS 

FSQNTNLHTHQRTHTGEKPFTCHECGKKFSQNS 

HLIKm^THTGEQPYTCSICRRNFSRRSSLLRHQK 

LHL*REACPVSHFWKTF 


3376 


A 


137 


2329 


SFESPAPLPSTCFPQERQDPGPCYVSGAMAGLGP 
GVGDSEGGPRPLFCRKGALRQKVVHEVKSHKFT 
ARFFKQPTFCSHCTDFIWGIGKQGLQCQVCSFW 
HRRCHEFVTFECPGAGKGPQTDDPRNKHKFRLH 
SYSSPTFCDHCGSLLYGLVHQGMKCSCCEMNVH 
RRCVRSVPSLCGVDHTERRGRLQLEIRAPTADEI 
HVTVGEARNLIPMDPNGLSDPYVKLBCLIPDPRNL 
OXQKTRTN^TT-NPVWNETFVFNLKPGDVERRL 
:SVEVWn^XnDRTSR>roFMGAMSFGVSELLKAPW 
GWYKLLNQEEGEYYNVPVADADNCSLLQKFEA 
CNYPLELYERVRMGPSSSPIPSPSPSPTDPKRCFFG 
ASPGRLHISDFSFLMVLGKGSFGKVMLAERRGSD 
ELYAIKILKKDVIVQDDDVDCTLVEKRVLALGG 
RGPGGRPHFLTQLHSTFQTPDRLYFVMEYVTGG 
DLMYHIQQLGKFKEPHAAFYAAEIAIGLFFLHNQ 

\Jll I KJUL, JS-LJJ In V MJLJLi AJ&OJniJsJ i Lfr ^JJVIU JSJiiN V r Jr 

GTTTRTFCGTPDYIAPEIIAYQPYGKSVDWWSFG 

KSLSREAVAICKGFLTKHPGEAPGASGP*WGKLT 
IRAHGFFPLGFDWERLERL\EIPASFSRPRPCGPQR 
RGIFDKFFTRAAPA\LTPPARLVLDSIDQADFQGF 
TYVNPDFVQPDARSPTSTVHVPVM 


3377 


A 


918 


738 


SSMLWGFSVFRRSWILNCWLSSSQVGISAACKFS 
TLTHTHTHTHTHTRHAPFCGTCLYY 


J J JO 


A 

A 


1 lOA 




FCVT lA/fVTFTTnTQflVTMQriVTTT AVXTT OK'WT PTSir* 
r oJMLfiMJv 1 r Xlurl V 1 JN o w Jv 1 i ljt/vl\J>l J^l^ JsJrlXJrXN 

SVISQDDFFKPESEIETDKNGFLQYDVLEALNME 

KMMSAISCWMESARHSVVSTDQESAEEIPILIffiG 

FLLF>rm^LDTIWmSYFLT]PYEECKRRRSTRVY 

QPPDSPGYFDGHVWPMYLKYRQEMQDITWEW 

YLDGTKSEEDLFLQVYEDLIQELAKQKCLQVTA* 

RRNTTNPS/CK*IRKLOGVI 


3379 


A 


1126 


456 


FSKLIMKTFIIGISGVTNSGKTTLAKNLQKHLPNC 

SVISQDDFFKPESEIETDKNGFLQYDVLEALNME 

KMMSAISCWMESARHSWSTDQESAEEEPILIIEG 

FLLFNYKPLDTIWNRSYFLTIPYEECKRRRSTRVY 

QPPDSPGYFDGHVWPMYLKYRQEMQDITWEW 

YLDGTKSEEDLFLQVYEDLIQELAKQKCLQVTA* 

jp>rrrNPR/CK.*IRKLQGVl 


3380




XAAt 1 ' 
IViJ 1 


794



VRAPAAGQPRAmGAPPPPGTPPPSPMSSAIERKS 
LDPSEEPVDEVLQIPPSLLTCGGCQQNIGDRYFLK 
AIDOYWHEDCLSCDLCGCRLGEVGRRLYYKLGR 
KLCTKDYLRLFGQDGLCASCDKRIRAYEMTMRV 
KDKVYHLECFKCAACQKHFCVGDRYLUNSDIV 

rEODIYEWTBCINGMI 


3381 


A


94S 


474 


SLKLRKPPLFTDGVHFVFVHSQLDFWGPQbML 1 
QQGMALQNYDNKLVKCIEELCQKQEELCWQIQ 
OTEDKKQBLQNEVRQLTEKLACVNEKLARVNE 
iSL^ASCaO^QTIAETEATYLKILESFnTLLS 
vPTfRFAmMLTKATAPDQKSSGGRDS 


3382 


A 


1 


14S8 


GmGKMADRGGVGEAAAVOASPASVPGLNJ|^lA, 

WRERLRAGLAGTGASLWFVAGLGLLYALRPLR 

LCENLAAVTVFLNSLTPKFYVALTGTSSLISGLIFI 

FEWWYFHKHGTSFIEQVSVSHLQPLMGGTESSIS 

EPGSPSBNRENETSRQNLSEGKVmL^^^ 

EYRRYTWVTGKEPLTYYDMNLSAQDHQTFFTC 

DTDFLRPSDTVMQKAWRERNPPARKAAYQALE 

LN/E*LCHCICSTG*GRSNNYCRC*KVI*TGTQGR 

RM^L*AVTAWAPKSSA*SSTEERY^TGIY*LKI 

GNVCKKIRKNKRSSKNNERFDE*ISSSYHVEHP* 

KSLVKSLLELQAYPDVQAVLAKYDDISLPKSAATC 

YTAALLKTRTVSEKFSPETASTRGLSAAEINAVD 

AIHRAVEFNPHVPKYLLEMKSLILPPEHILKRGDS 

SoAYAFFHLQHWKRIEGALNLLQCTWEGSKYS 

ppKVTt.TSTTffl 


3383 


A 


282 


2443 


RGKGFKEFFLGVCQTFlPCLCAEGIQLQU-CSUb^j 

SSPLLKDLESMKTGLFFLCLLGTAAAIPTNARLLS 

DHSKFTABTVAPDNTAIPSLRAEAEENEKETAyS 

m)DSHHKAEKSSVLKSKEESHEQSAEQG\KSS\S 

OELGIEGFKRDSDGSL*VWNLVEYGTNLKGTLDI 

KEDMSEPQEKKLSENTDFLAPGYSSFTDSNQQES 

ITKREENQEQPRNYSHHQLNRSSKHSQGLRDQG 

NQEQDPMSNQEEEEEKEPGEVGTHNDNQERKTE 

MJPREHANSKQEEDNTQSDDILEESDQPTQVSKM 

OEDEFIXJGNQEQEDNSNAEMEEENASNVNKHIQ 

ETEWQSQBGKTGLEAISNHKETEEKTVSEALLNffi 

PTDDGNTTPRNHGVDDDGDDDGDDGGTDGPRH 

SA\SDDYFHPKPGLFWEAERA\HSIAYSPSKLREQ 

REKVHENENIGTTEPGEHQEAKKAENSSNEEETS 

SEGNMRWHAVDSCMSFQCKRGHICKADQQGKT 

SLVSCQDPV^^CPPTKPLDQVCGTDNQTYA^^ 

LFATKCRLEGTKKGHQLQLDYFG\ASKSIPTyCRI) 

FEVIQ\FPLRMRDWVLKNILMQLYEANSEHAG-inL 

NEK\QRl^VKKIYL\DEKRLIAGDHProLIXOT 

KKYHMYVYPVHWQFSELDQHPMDRVLTHSaA 

PLRASLVPMEHdTRFFEECDPNKDKHITLKEWG 

TT/-«t?nn>^T7T?nrnP.MT J .F 


3384 


A 


3166 


928 


PSRPHPTHAAMAGPEGFQVKALyPFRB£lcebUJ.n 
LLPGDVLWSRAALQALGVAEGGERCPQSVGW 

MPGLNERTRQRGDFPGTYVEFLGPVAL|^PR 
PRGPRPLPARPRDGAPEPGLTLPDLPEQFSPPDVA 
Pn.LVKLVEAIERTGLDSESHYRPELPAPRTDWSL 










SDVDQWDTAALADGIKSFLLALPAPLVTPEASAE 

ARRALREAAGPVGPALEPPTLPLHRALTLRFLLQ 

HLGRVASRAPALGPAVRALGATFGPLLLRAPPPP 

SSPPPGGAPDGSEPSPDFPALLVEKLLQEHLEEQE 

VAPPALPPKPPKAK\PASTVPGPNGGSPPSL\QDA 

EWYWGDMSREEVNEKLRDTPDGTFLVRDASSKI 

QGEYTLTLRKGGNNKLDCVFHRDGHYGFSEPLTF 

CSWDLINHYRHESLAQYNAKLDTRLLYPVSKY 

QQDQIVKEDSVEAVGAQLKVYHQQYQDKSREY 

DQLYEEYTRTSQELQMKRTAIEAFT^IKIFEEQG 

QTQEKCSKEYLERFRREGN/QTKEMQRILLNSER 

PH*TSLKPDLMQLRKIRDQYLVWLTQKGARQKK 

INEWLGIKNETEDQYALMEDEDDLPHHEERTWY 

VGKINRTQAEEMLSGKRDGTFLIRESSQRGCYAC 

SVWDGDTKHCVIYRTATGFGFAEPYNLYGSLK 

ELVLHYQHASLVQHNDALTVTLAHPVRAPGPGP 

PPAAR 


3385 


A 


43 

t 


2372 


TRDVNSWKELCFNHYNKETTNCYRTTRKWTNY 

KIIFLGPERELRSQGNQVILNLGKERCQLRETGLK 

LYLPGMDSARHHISHSTSAGPIPSQKEEEMTESQ 

GTVTFKJDVATOFTQEEWKRLDPAQRKLYRNVML 

♦NYNNLITVGYPFTKPDVIFKLEQEEKPWVMEEE 

VLRRHWQGEIWGVDEHQKNQDRLLRQVEVKFQ 

KTLTEEKGNECQKKPANVFPLNSDFFPSRHNLYE 

YDLFGKCLEHNFDCHNNVKCLMRKEHCEYNEP 

VKSYGNSSSHFAir^ 

LRfflTGEKPYEGSNCRKAFSHKEKLIKHYKlHSRE 

QSYKCNECGKAFIKMSNLIRHQRIHTGEKPYACK 

ECEKSFSQKSNLIDHEKIHTGEKPYECNECGKAFS 

QKQSLIAHQKVHTGEKPYACNECGKAFPRIASLA 

LHMRSHTGEKPYKCDKCGKAFSQFSMLIIHVRIH 

TGEKPYECOTCGKAFSQSSALTVHMRSHTGEKP 

YECKECRKAFSHKKNFITHQKIHTREKPYECNEC 

GKAFIQMSNLVRHQRIHTGEKPYICKECGKAFSQ 

KSNLIAHEKIHSGEKPYECNECGICAFSQKQNFIT 

HQKVHTGEKPYDCNECGKAFSQIASLTLHLRSHT 

GEKPYECDKCGKAFSQCSLLNLHMRSHTGEKPY 

VCNECGKAFSQRTFLIVHMRGHTGEKPYECNEC 

GKAFSQSSSLTIHIRGHTGEKPYECKECRKAFSHK 

KNFITHQKIHTRE/KPFKCNHCGKGFNQTLDLIRH 

LREHTGEKPYECSNCRKAFSHKJEKLIKHYKIHSRE 

QSYKO^CGKAFIKMSNLIRHQRIHTGEKPYACK 

ECEKSFSQKSNLroHEKIHTGEKPYECNECGKAFS 

QKQSLIAHQKVHTGEKPYACNECGKAFPRIASLA 

LHMRSHTGEKPYKCDKCGKAPSQFSMLIIHVRIH 

TGEKPYECNECGKAFSQSSALTVHMRSHTGEKP 

YECKECRKAFSHKKISIFITHQKIHTREKPYECNEC 

GKAFIOMSNLVRHORIHTGEKPYICKECGKAFSO 

KSNLIAHEKIHSGEKPYECNECGKAFSQKQNFIT 

HQKVHTGEKPYDCNECGKAFSQIASLTLHLI^HT 

GEKPYECDKCGKAFSQCSLLNLHMRSHTGEKPY 

VCNECGKAFSQRTFLIVHMRGHTGEKPYECNEC 

GKAFSQSSSLTIHIRGHTGEKPYECKECRKAFSHK 

KNFITHQKIHTElENPLSVnVEKASIRLWTSSDI 



3386 



3387 






201 






1032 



86 



3388 



98 



3197 



OTWQGALI^AAEULHtLOPPbKVKOgLR 



GITGPAWYCHSPSHSLLSAFCHLPTPSRCPAMAR 
?i^SV^A^lWHESmRGQGVPGLHSAQEP 
^•SSAAAAVLSIDTASYKIFVSGKSGyGKT 
ALvALAGLEVPWHmTTGIQTrVWWPAK^ 

NTOAFLFLFSFTDRASFEDLPGQ^ 
VSmGSKFDQYMHTDVPERDLTAFRQAWELPL 



Sgpegaqerpsqaapav^^ 
JS^qSrdvseelsrqledilstycvdwj 

ogSgedgaqgepaepedaeksrtyvabng^e 

?^^GEK^PSKGDPNTEEIRQSDEVGDM)HRR 

reSSxkeaVesqrmcelmkqqeth^ 
SS5toleklcralqt/gaq*pvrgqrw 



gshrtsavrifs 



, ARPEVPAPPAWL SRRGAAKMUUKkUi^';^,;^^^^ 
ffiBiWDDLKKEVA^mHKMSVEE^^^ 

' I^Sqglthskaqeilardgpnaltppptwew 
wSSggfsillwigailcflaygiqagtedd 

SSGm.AAWnTGCFSYYQEAKSSKBffi 

sSSvpqqalviregekmqvnaeevvvgdlv 

ISSGDRSRnSAHGCKVDNSSLTGESEPQT 
SDCTroS^KTRNITWSNNFVEGTARGVVVA 
?S)RTSu^TLASGLEVGKTPIAffiIEIffI^^ 
SvAmGVSFFILSLILGYTWLEAVIFLro^^ 

pegiIatvtvcltltakrmarknclvknleaa^ 

?^GSsicskTG'ILTQNRMTVAHMWFDNQIH 

SdSSgSfdksshtwvalph/llgfc^^^ 

?VFKGGODNIPVLKRDVAGDASESALLKCIELSS 
NDNRYLLVMKGAPERILDRCSTILLQGI^QPLDE 

Seafqnaylelgqlqervlgfchyylpeeqf 

pSo^DVl^mCFVGLMSMGP^^^ 

vpdavgkcrsagikvimvtgdhpitakaiakgv 

SffSrVEDIAARLNPVSQVNPRDAKACVm 
GTOLSpTSEQIDEILQlWTCIVFARTSPQQKLny 

pSoSga^avtqdgvndspalkkadigvam 
gSv^kqaadmillddnfasivtgveegru 

^^^lAYTLTSNPEnPFLlJFIMAMPLPLOT 

tSE^RLISmaygqigmiqalggffsyfviia 
SSi^pgnlvgirlnwddrtvndledsygqqw 

g^VFQQGMKMaUFGU^ETALAAFLSYCPGM 
S AT rfiv°- i^qwwppappYSFLIFVYDEIRKLI 



LRRNPGGWVEKETYY 



3389 



45 



5250 



VERLLGCRNSKRTWRMLISKNMPWRRLQGISFG 

MYSAEELKKLSVKSITNPRYLDSLGNPSANGLYD 

LALGPADSKEVCSTCVQDFSNCSGHLGHIELPLT 

VYNPLLFDKLYLLLRGSCLNCHMLTCPRAVIHLL 

LCQLRVLEVGALQAVYELERILNRFLEENPDPSA 

SEIREELEQYTTEIVQNNLLGSQGAHVKNVCESK 

SKLIALFWKAHMNAKRCPHCKTGRSWRKEHNS 

KLTITFPAMVHRTAGQKDSEPLGffiEAQIGKRGY 

LTPTSAREHLSALWKNEGFFLNYLFSGMDDDGM 

ESRFNPSVFFLDFLVVPPSRYRPVSRLGDQMFTN 

GQTVNLQAVMKDVVLIRKLLALMAQEQBCLPEE 

VATTTTDEEKDSLIAIDRSFLSTLPGQSLIDKLYNI 

WIRLQSHVNIVFDSEMDKLMMDKYPGIRQILEK 

KEGLFRKHMMGKRVDYAARSVICPDMYINTNEI 

GIPMVFATKLTYPQPVTPWNVQELRQAVINGPN 

VHPGASMVINEDGSRTALSAVDMTQREAVAKQ 

LLTPATGAPKPQGTKIVCRHVKNGDILLLNRQPT 

LHRPSIQAHRARILPEEKVLRLHYANCKAYNADF 

DGDEMNAHFPQSELGRAEAYVLACTDQQYLVP 

KDGQPIAGLIQDHMVSGASMTTRGCFFTREHYM 

ELVYRGLTDKVGRVKLLSPSILKPFPLWTGKQVV 

STLLINIIPEDHIPLNLSGKAKITGKAWVKETPRSV 

PGFNPDSMCESQVIIREGELLCGVLDKAHYGSSA 

YGLVHCCYEIYGGETSGKVLTCLARLFTAYLQL 

YRGFTLGVEDBLVKPKADVKRQRIIEESTHCGPQ 

AVRAAtNLPEAASYDEVRGKWQDAHLGKDQRD 

FNMIDLKFKEEVNHYSNEINKACMPFGLHRQFPE 

NTLQLMVQSGAKGSTVNTMQISCLLGQIELEGRS 

TPLMASGKSLPCFEPYEFTPRAGGFVTGRFLTGIK 

PPEFFFHCMAGREGLVDTAVKTSRSGYLQRCIIK 

HLEGLWQYDLTVRDSDGSWQFLYGEDGLDIP 

KTQFLQPKQFPFLASNYEVIMKSQHLHEVLSRAD 

PKKALHHFRAIKKWQSKHPNTLLRRGAFLSYSQ 

KIQEAVKALKLESENRNGR/RPWDS/G/RMLRMW 

YELDEESRRKYQKKAAACPDPSLSVWRPDIYFAS 

VSETFETKVDDYSQEWAAQTEKSYEKSELSLDR 

LRTLLQL\KWQRSLCEPGEAVGLLAAQSIGEPST 

QMTLNTFHFAGRGEMNVTLGIPRLREILMVASA 

MKTPMMSVPVLNTKKALKR\^LKKQLTRVCL 

GEVLQKIDVQESFCMEEKQNKFQVYQLRFQFLP 

HAYYQQEKCLRPEDILRFMETRFFKLLMESIKKK 

NNKASAFRNVNTRRATQRDLDNAGELGRSRGE 

QEGDEEEEGHIVDAEAEEGDADASDAKRKEKQE 

EEVDYESEEEEEREGEENDDEDMQEERNPHREG 

ARKTQEQDEEVGL/GH*GGPVPSRPPDAAPETHP 

QPGAPGA\EAMERRVQAVREIHPFIDDYQYDTEE 

SLWCQVTVKLPLMKINFDMSSLVVSLAHGAVIY 

ATKGITOCLLNETTNNKNEKELVLKIEGINLPELF 

KYAEVLDLRRLYSNDIHAIANTYGIEAALRVIEK 

EIKDVFAVYGIAVDPRHLSLVADYMCFEGVYKP 

LNRFGIRSNSSPLQQMTFETSFQFLKQATMLGSH 

DELRSPSACLWGKWRGGTGLFELKQPLR 



3390 



2080 



DLPPLEGPPAQASPSSnVILGEGSQPDWPGGSRYD 
LDEIDAYWLELINSELKEMERPELDELTLERVLE 



3391 



1555 



327 



3392 



218 



1773 



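Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)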



W^SL^CTGTCIQCSMPSCWTAFHVTCAF 
D^SmSSDEvWKSFCQEHSDGGPRNE 

SSSSaglstsfpidgtffnswlaqsv 

Q?SSMSEWPL>mGHREDPAPGLLS^U^^ 

RLRPPPREPR*T\RRLPGC/ARPDAGDGDHLSAVA 
S\SuHFDTElDG\YFS\DGEMSNS\DV\EAED 
nnvORGPREAGAKEWVRMGVLAS 



DaSsfflSTQELGVCGCPFRGVSCLVGELALVQA 

S5^iGH)CVILLl^NGLWNGISCTSSFIAICE 
FPA 



LVSVOTQAGia>DALTKLEQGEPLWTLED™^ 
SS5^S,TEHiRAHRGEKPHGCSLCGK^ 

SSJrltcherahkgekpygcsecgkafpri^e 

ySJSfflTGIKPHQCSECGRAFSRKSLLVVHQR 
^^GKAFSQKSCLVAHQRYHTGKTP^^^^^^ 

^g^S£GD/CSEGGKSFHSKSQLKS-TC 
Ar.KKPC*YGNCGNGGRAV 



1464 



OTSG^SGS^^GTSllR^^^^ 










LAVSLNKRDLFPMGSPIPWVSVP\NNTEKPVKKI 

KA\SVEQVANVVLYS\SDY\YVKPVAMEEAQEKV 

PPNSTWTKA\LTLL\PWLVNNKERRGIALDGKIKH 

EDTNLASSTIIKEGIDRKRSWEILVSYPDQR*SSTV 

SGFLGRASPSQ*SRPT*RSQFRL\MHPQP\EDPA\K 

ESYQDANLVF\EEFAIU'*ILKDAGEA*\EGKRDQE 


3394 


A 


211 


1591 


RPPTMAADQRPKADTLALRQRLISSSCRLFFPEDP 

VKIVRAQGQYMYDEQGAEYIDCISNVAHVGHCH 

PLWQAAHEQNQVLNTNSRYLHDNIVDYAQRLS 

ETLPEQLCVFYFLNSGSEANDLALRLARHYTGH 

QDVWLDHAYHGHLSSLEDISPYKFRNLDGQKE 

WVHVAPLPDTYRGPYREDHP\THVEDGLEKAFS* 

KRWQGRMIQICRRQIAAFFAESLPSVGGQIIPPA 

u Yr ov^ V AJDJnUUKJ\A\jlJ Vr V AUmV^ V Ur UJv V ijJvrlr 

WAFQLQGKDFVPDIVTMGKSIGNGHPVACVAAT 

QPVARAFEATGVEYFNTFGGSPVSCAVGLAVLN 

VLEKEQLQDHATSVGSFLMQLLGQQKDCHPIVG 

DVRGVGLFIGVDLKDEATRTPATEEAAYLVSRL 

KENTVn/LLSTDGPGRimKFKPPMCFSLDNARQV 

VAKLDAILTDMEEKVRSCETLRLQP 


3395 


A 


1 


1424 


FRDGFSLRCGCNAELPGRGGDDAADRAIQRFLR 

TGAAVRYKVMKNWGVIGGIAAALAAGIYVIWG 

PITERKKRRKGLVPGLVNLGNTCFMNSLLQGLSA 

CPAFIRWLEEFTSQYSRDQKEPPSHQYLSLTLLHL 

LKALSCQEVTDDE\a,HASCLLDVIJlMYRWQISS. 

FEEQDAHELFmafsSLE^^ 

SLE\HSQK*LPKQITCRTRGSPHPTSNHWKSQHPF 

HGRLTSNMVCKHCEHQSPVRFDTFDSLSLSIPAA 

KGTLNGEKVEHQRTTFVKQLKLGKLPQCLCIHL 

QRLSWSSHGmKRHEHVQFT^FLMMDIYKYHL 

LGHKPSQHOTKLNKNPGPTLELQDGPGAPTPGL 

NQPGAPKTQIFMNGACSPSLLPTLSAPMPFPLPV 

VPDYSSSTYLFRLMGSCRPPWETWHSGTLCSFTD 

GPHL 










3396 


A 


109 


107 


TQEAGLIFFSPPFSLSLSLSLPLSLFLLSHPHSRTPP 

NRTPRRTRIPQRPAVMYSPLCLTQDEFHPFIEALL 

PHVRAFAYTWFISn.QARKRKYFKKHEKRMSKEE 

ERAVKDELLSEKPEVKQKWASRLLAKLRKDIRP 

EYRBDFVLTVTGKKPPCCVLSNPDQKGKMRRID 

CLRQADKVWRLDLVMVILFKGIPLESTDGERLV 

KSPQCSNPGLCVQPHfflGVSVKELDLYLAYFVH 

AADSSQSESPSQAK*R*H*GPARKWDIWGFQ\DS 

FVT\SGVF\SVT*A*LRVSQTPI\AAG\TGPNFSLSD 

LESSSYYSMSPGAMRRSLPSTSSTSSTKRLKSVED 

T7^>rT^cp/*iT7T7PT7V'mr\np cpncricr\cc/^\irtJT7\/T7prj 
iljVLJL^oJrijJbJciJrr 1 1 UrV^OlOr OoOovaoU Willi V ilr O 

MPSPTTLKKSEKSGFSSPSPSQTSSLG\TAFTQHHR 

PVITGTOSKFHTATPSILVHFPRHSPFFOOPGPYFSH 

PAIRYHPQETLKEFYQLVCPDAGQQAGQPNGSS 

QGKVHNPFLPTPMLPPPPPPPMARPVPLPVPDTK 

PPTTSTEGGAASPTSPTTRS/PGRTJIPQQPFL/SYG 

PP*PSNALIGGGGGGAGERAGERADLEM 


3397 


A 


1 


2002 


TGTLTEDGLDVMGWPLKGQAFLPLVPEPRRLP 
VGPLLRALATCHALSRLQDTPVGDPh4DLKMVES 










rCWVLEEEPAADSAFGTQVLAVMRPPLWiiiVW 

MvIEEPPVPVSVLHRFPFSSALQRMSVWAWPGA 

rOPEAYVKGSPELVAGLCNPETVPTDFAQMLQS 

VTAAGYRVVALASKPLPSVPSLEAAQQLTRDTV 

EGDLSIXGLLVMRNLLKPQTTPVIQALBRTRmA 

VMVTGDNLQTAVTVARGCGMVAPQEHLnVHA 

THPERGQPASLEFLPMESPTAVNGVKDPDQAAS 

YTVEPDPRSRHLALSGPTTGIIVKHFPKLLPKVLV 

OGTVFARMAPEQKTELVCELQKLQYCVGMCGD 

GANDCGALKAADVGISLSQAEASVVSPFTSSMA 

SIECVPMVIREGRCSLDTSFSVFKYMALYSLTQFI 

SVLILYTINTOLGDLQFLAroLVITTTVAVLMSRT 

GPALVLGRVRPPGALLSVPVLSSLLLQMVLVTG 

VOLGGYFLTLAQPWFVPLNRTVAAPDNLPNYEN 

TVVFSLSSFQYLILAAAVSKGAPFRXRPLTNNVPF 

LLASAL*SSVLVVLVLSPGLLHGPLALBNTrDTGF 

KLLLVGLVTLNFVGGLHAGERARPVPPRLPAPPP 

AOArASKKRFKOLERELAEQPWPPLPAGPLR 


3398 


A 


758 




FPFRMLTGYLYLMWRHKAt WbO lyRHPLPGGL 
KRRRRPGRGPWPAPGGQGVGPSAL*KAGSPPAN 
RPG0GEm3LISPKPVTEVLPDVQGAPVPVPPLPT 
PPSLPHLQNQPPm^QHYLLSFSWKPSQGPE*RA* 
PSPLPPAAI^DG*PGPASQGPDQPG\PCPPASLP 
TSPPGKQFQKIETRKHPPPRQQHKPKCTANRPLA 

SFL 


3399 


A 


906 


1091


HHHHHHHHHHHHHLVAFGKVQ*LQMtjFSSSSbs 
Rsr.rFWOARFSSYRTLHHHHHHHHHHHHH 


3400 


A 


1838 


325 


-'piCsVHRSPHGPStaCUUPQASLVPEPVPOOCtjb 
PEEMSWPPSGEIASPPELPSSPPPGLPEVAPDATST 
GLPDTPAAPETSTNYPVECTEGSAGPQSLPLPILE 
PVKNPCSVKDQTPLQLSVEDTTSPNTKPCPPTPTT 
PETSPPPPPPPPSSTPCSAHLTPSSLFPSSLESSSEQ 
KFYNFVILHARADEHIALRVSGRSWEALGVPDG 
ATFCEDFQVPGRGELSCLQDAIDHSAFIILLL'nSN 
\FDCR\LSLHQVNQAMMSNLmQGSQDCVlP\FLP 
\LESSPARLSSDTASLLSGLVRLDEHSQIFARKVA 
NTFKPHRLQARKAMWRKEQDTOALREQSQHLD 
GERMQAAALNAAYSAYLQSYLSYQAQMEQLQV 
AFGSHMSFGTGAPYGARMPFGGQVPLGAPPPFP 
TWPGCPQPPPLHAWQAGTPPPPSPQPAAFPQSLP 
FPAVPKPFPTASTAPPSEPKGWQP\LIIHHAQMVT 
RWfi*NKH\MWNORGSQAPEDKTOEAE 


3401 
3402 


A

A


153

153


1389 
1389 


- EWGWLGAAQPPbbbAEAEDQESPSSLCRhALAkl 
KKEISPLFIGMEKCSVGGLELTEQTPALLGNMAM 
ATSLMDIGDSFGHPACPLVSRSRNSPVEDDDDDD 
DWFffiSIQPPSISAPAIADQRJJFIFASSKNEKPQG 
NYSVIPPSSRDLASQKGNISETIVIDDEEDIETNGG 
AEKKSSCFBEWGLPGTKNKTNDLDFSTSSLSRSK 
VNAGMGNSGITTELTLKYimmri^TGISSWA 
GODVNniTYKTSL*NTNLGDVAKGLQSSNFGVNI 
0TYTPSLTPQTKTGVWLL1LVE*MWQETYFRME 
NLOLII/CPEDASTKKANVILPVESSKSFQEFYSTS 
CLSPCENNWNLKKGVFNKSRC-nCSKLAEVWIFI 
PKT T FRLTVnLTFKCYYVLFHLHNARVLDV 

- EWGWLGAAOPPlibbAEAEDOEJSi'SSLUKijAi.Aiii 










KKEISPLFIGMEKCSVGGLELTEQTPALLGNMAM 

ATSLMDIGDSFGHPACPLVSRSRNSPVEDDDDDD 

DWFIESIQPPSISAPAIADQRNFIFASSKNEKPQG 

NYSVIPPSSRDLASQKGNISETIVIDDEEDIETNGG 

AJtlJsJvooCr Id W u J-rrvj 1 IsJN Jv 1 INUi-»Ur & i ooi^Xols. 

WAGMGNSGITTELTLKYnTNVTTLETGISSVNA 

GQDVNinTyKTSL*NTNLGDVAKGLQSSNFGVNI 

QTYTPSLTPQTKTGV\NLLTLVE*MWQETYFRME 

NU^LmCPEDASTFaCANVILPVESSKSFQEFYSTS 

CLSPCEIWWNLKKGVFNKSRCTICSKLAEVWIFI 

PKLLFRLTVIILTFKCYYVLFHLHNARVLDV 


3403 


A 



609 



2765 


SRHCTPAERQNETHRAPDFAMSAVLGHQPPFFPA 

LTLPPNGAAALSLPGALAKPIMDQLVGAAETGIP 

FSSLGPQAHLRPLKTMEPEEEVEDDPKVHLEAKE 

LWDQFHKRGTEMVITKSGRRMFPPFKVRCSGLD 

KKAKYILLMDIIAADDCRYKFHNSRWMVAGKA 

DPEMPKRMYIHPDSPATGEQWMSKVVTFHKLKL 

TNMSDKHGFTILNSMHKYQPRFHIVRANDILKLP 

YSTFRTYLFPETEFIAVTAYQNDKITQLKIDNNPF 

AKGFRDTGNGRREKRKQLTLQSMRVFDERHKK 

ENGTSDESSSEQAAFNCFA\QASSPAA\PL*RTSNL 

KDRSPSRG*RATPEAEEQRGSTAPRPATRAKISP 

HPRRRSPAVTRAAPAVKAHLFAAERPRDSGRLD 

KASPDSRHSPATISSSTRGLGAEERRSPVREGVQA 

PAKVEEARALPGKEAFAPLTVQTDAAAAHLAQG 

PLPGLGFAPGLAGQQFFNGHPLFLHPSQFAMGG 

ArSSMAAAGMGrbLATVSGASTGVSGLDSTAM 

ASAAAAQGLSGASAATLPFHLQQHVLASQGLA 

MSPFGSLFPYPYTYMAAAAAA/SSAAASASVHRT 

D\T7Vrr XTTTV>rDT>DT "D ■\/OD"VCTT>\7*D\7T3r^/^ CCT T T^AT TJO 

MAAAAGPLDGKAAALAASPASWAVDSGSELNS 
RSSVTLSSSSMSLSPKLCAEKEAATSELQSIQRLVS 
GLEAKPDRSRSASP 


3404 


A 


1082 


1308 


LKXFLEVPQSYSLLLSSPFLQ\WRA*RPQNAIG*Q 
FIIKTLVFFGIMRSAGDVLSTQVSCALRB4RTAGC 
SHSSP 


3405 


A 


1553 


559 


PRPPTQRLSRFAPPCRTAEFPFRRRAWTRPAPPR 

ACTVVGRSSPVTGLAVGAAVAMLTVAARSRPFA 

PVLSATSRGVAGALTAP^MQATVPATPEQPVLDL 

JsJNPrLbKJboLoCjyAVKJKPL VrAbVCyb 

HTDDCVPDFSEYRRLEVLDSTKSSRESSEARKGFS 

YLVTGVTTVGVAYAAKNAVTQFVSSMSASADV 

LALAKmnCLSDPEG&NMAFKWRGKPLFVRHRT 

QKEEEQEAAVELSQLRDPQHDLDRVKKPEWVILI 

GVCTHLGCVPIANAGDFGGYYCPCHGSHYDASG 

RIRLGPAPLNLEVPTYEFTSDDMVIVG 


3406 


A 


83 


2671 


CLYPDFCRSVTCAMPCFTHRSCREDPGTSESREM 
DPVAFKDVAVNFTQEEWALLDISQKNLYREVML 

EEKVNEIKEDSHCGETFTPVPDDRLNFQKKKASP 

EVKSCDSFVCEVGLGNSSSNMNIRGDTGHKACE 

CQEYGPKPWKSQQPKKAFRYHPSLRTQERDHTG 

KKPYACKECGKJ^raYHSSIQRHMVVHSGDGPYK 

CKFCGKAFHWLSLYLIHERTHTGEKPYECKQCG 

KSFSYSATHRIHERTHIGEKPYECQECGKAFHSPR 



3407 






1426 



3408 



106 



4514 



PCT/US01/04098




SCHRHERSHMGbKAYQCKBCGl^MCmVRR 
ffiRTHSRKKLYECKQCGKALSSLTSFQTHIRNfflS 
^YECKTCGKGFYSAKSFQRHEKTHSGEKPY 
KCKQCGKAFTRSGSFRYHERTHTGEKPYECKQC 

gkaSsapnlqshgrtotgekpyeckecgkafef 

?£5LQSHERTQTHimSGERRYKCKICGKGF^^^ 
PKSFQRHEKTHTGEKLYE(yrATFSSSFSSSSSF*Y 
HERraTGEKPYKCEQCGKAFRAVSIL*MHGRTH 
PEEKPYECEQ*RKAFRSAPHL*IRGRTHNGEKPY 
ACKKCGKPFGSAQNLRIHERTQTHIMHSVERPYK 

cSrgfysaksfqtheksytgekpyeckqcg 

KAFVSFTSFRYHERTHTGENPYECKQFGKAFRSV 

KmRFHKRTHTGEKPCEYMKBLTLEGNTMNAS 

NVAKLSLLPVLFNIMKEFTT.GRNPISVSNVR1^LF 

LPLLFNMKGLTWERNPMSVCHVGKPSFLLVPTO 

IMKGLTLERSPMNISNVGKPSDQPRTFKCMEGLT 

T pgTJPMTsTVSSMGKRSDLTRFFEYR 



PAAPSGASPGRVCGVETARPLUVQKKQbAJJbOP 
PGVAGLRHEPPTVWLGSVAHRGTWVCAHRWFG 
PAVTRAAQAATMVKLLVAKILCMVGVFFFMIX 
GSLLPVKIIETDFEKAHRSKKILSLCNTFGGGVI^ 
ATC\LTALLARC*GKSSRRSWSLGH1STDYPL\A£ 
mLLGFFMTVFLEQLILTFAQENAVLHBPGDLQR 
RIGRGQRLGVEPLHGGRAGPRAVRGAPRPKPQP 

eragpL^npspvrllsiafalsahsvfeglalglq 

EEGEKWSLFVGVAVHETLVPVALGISMAGSAM 

PLRDAAKLASriVSPMIPLGIGLGLGffiKAQGVPG 

SVASVLLQGPGGRHLSLFITFPGKSWPRSV^S 

DRLLKVLFVLVVGYTVLAGMGLPQVVSGLArVPA 

AGSPPGAPGRTQAASPGRASPKSEHCGPGPPPVH 

KGPPQ-nRLCPRSYTLSLRALLLFKILLSLKSLYQK 



KK 



EARDRLAQSRAKEKBLNSVASKLSARQbbSbHSH 
KHLffiLRREFlOCNVPEEIREMVAPA^^Q/^W 
ALSKRSQEAEAAFLSVYKQLIEAPALWELKIJ^R 
PALGDSRVQQGQHDPKTDNQNTQQKAGFKEGW 
LAEASEREAFGPGFKDPVPVFEAARSLDDRLQPP 
iDPSGQPRRDLHTSWKRNPELLSPKALKATQAE 
LLELRRKYDEEAASKADEVGUMTNLEKANQRA 
EAAOREVESLREQLASVNSSIBLACCSPQGPSGD 
KVNFTLCSGPRLEAALASKDREILRLLKDVQHLQ 

Wleeasanqiadlerqltakseaiekle^ 

OAQSDYEEIKTELSILKAMKLASSTCSLPQGM^ 

hedsluakeaffpiqkfllekpsllaspeedpsed 

DSKDSLGTEQSYPSPQQLPPPPGPEDPLSPSPGQP 
LLGPSLGPDGTRTTSLSPFPSLASGERLMMPPAAF. 

kSgllvfppafygakpptapatpapgpeplg 

GPEPADGGGGGAAGPGAEEEQLDTAEIAFQVKE 

ollkhnigqrvfghyvlglsqgsvseilarpkpx 

WRKLHG**GKEPFIKMKQFLSDEQNVLALRTIQy 
RORGSrrPRIRTPETGSDDAIKSILEQAKKEIESQK 
GOEPKTSVAPLSIANGTTPASTSEDAKSILEQAR 

Sqaqqqallemevaprgrsvppsppekpsi^t 
Sgapalvkqeegsggpaqaplpvlspaafv 

OSnRKVKSEIGDAGYFDHHWASDRGLLSRPYAS 










VSPSLSSSSSSGYSGQPNGRAWPRGDEAPVPPED 

EAAAGAEDEPPRTGELKAEGATAEAGARLPYYP 

AYWRTLKPTVPPLTPEQYELYMYREVDTLELTR 

QVKEKLAKNGICQRIFGEKVLGLSQGSVSDMLSR 

PKPWSKLTQKGREPFIRMQLWLSDQLGQAVGQQ 

PGASQASPTEPRSSPSPPPSPTEPEKSSQEPLSLSLE 

SSKENQQPEGRSSSSLSGKMYSGSQAPGGIQEIV 

AMSPELDTYSmCRVKEVLTDNNLGQRLFGESIL 

GLTQGSVSDLLSRPKPWHKLSLKGREPFVRMQL 

WLNDPHNVEKLRDMKKLEKKAYLKRRYGLIST 

GSDSESPATRSECPSPCLQPQDLSLLQIKKPRVVL 

APEEKEALRKAYQLEPYPSQQTIELLSFQLNLKT 

NTVINWFHNYRSRMRREMLVEGTQDEPDLDPSG 

GPGILPPGHSHPDPTPQSPDSETEDQKPTVKELEL 

AGSQPQDSGELDKGQGPPKEEHPDPPGNDGLPK 
VAPGPLLPGGSTPDCPSLHPQQESEAGERLHPDP 
LSFKSASESSRCSLEVSLNSPSAASSPGLMMSVSP 
VPSSSAPISPSPPGAPPAKVPSASPTADMAGALHP 
SAKVl^NLQRRHEKMANLNNIIYRIJERAANREE 
ALEWEF 


3409 


A 


162 


1710 


GPLSPGPYQCRPSLPAQLYPQSLMAAATLRTPTQ 

GTVTFEDVAVHFSWEEWGLLDEAQRCLYRDVM 

LENLALLTSLDVHHQKQHLGEKHFISNVGRALF 

VKTCTFHVSGEPSTCREVGKDFLAKLGFLHQQA 

AHTGEQSNSKSDGGAISHRGKTHYNWGEHTKAF 

SGKHTT^VQOQRfeT^FSV^^^ 

NDHWRLHTGEKPYECRECGKSFRQSSSLIQHRR 

GHTAVRPHECDECGKLFSNKSNLIKHRRVHTGE 

RPYECSECGKSFNQRSALLQHRGVHTGEKPYEC 

SQNSSLIEHHRVHTGERPYKCSECGKSFRQRSAL 
LQHRGVPTGERPYECSECGKFFPYSSSLGKHQRV 

htgsrpyecsecgksftqnsglikhrrvhtgekp 
yecte*kksfshnsslikhqrjhsr*kpye\ckcg 
n\r*hpgesp*vhsecq/k:sfs*rpyliechtvhkg 
ktllicrdvqli 




A 




/ 07 


LCMKGISGGVRVAAIJ^ARAEREELPVPA^lEPOP 

tawgsphpeavlqlevapessgpctdtakdqqs 

dklpdlmppa\eplgsalelrasleidvae\rgce 

hgpsqqlprcp*swawsepwcqrpgcav*aplp 

y*reasfiyqshspaasgpfhsagagavylqagg 

v/geoekeavrkgsgssscsorgp\pppgmevcpl 

lgfwaicp 


3411 


A 


1040 


887 


ASLSKPAGISTMPWALILLFLLTHSAVS WQAGL 

tqppsvskdlr\qtatltctgnsnnvghqgviwl 
qqhqghppkllsyrnnnrpsgiserlsayksgna 
asltryglqtehead**crprrkllpktarlffffl 
idneeyllrvy 


3412 


A 


164 


83 


rrgipgsaslsltmcvrscfqsprlqwvwrtafl 

khtqrrhqgshrwthlggstyravifdmggvli 

pspgrvaaewevqnripsgtilkalmeggengp 

wmrfmraeitaegflrefgrlcsemlktsvpvd 

sffslltservakqepvmteaitqirakglqtavl 

snnfylpnqksflpldrkqfdvivescmegickp 


DPRIYKLCLEQLGLQPSESll'-LDDLGTNLlUiAAKl. 

GIHTIKVNDPETAVKELEALLGFTLRVGVPNTRP 

VKKTMEIPKDSLQKYLKDLLGIQTTGPLELLQFD 

HGOSNPTYYIRLANRDLVLRKKPPGTLLPSAHAI 

EREFRIMKALANAGVPVFNVLDLCEDSSVIGTPF 

YVMEYCPGLIYKDPSLPGLEPSHRRAIYTAMNTV 

LCKIHSVDLQAVGLEDYGKQGSTTWV/YSSRRA 

RGALLFLDWELSYPWGDPFADVGYSCLAHYLPS 

SFPVLRGINDCDLTQLGIPAAEEYFRMYCLQMGL 

PPTENWNFYMAFSFFRVAAILQGVYKRSLTGQA 

SSTYAEQTGKLTEFVSNLAWDFAVKEGFRVFKE 

MPFTNPLTRSYHTWARPQSQWCPTGSRSYSSVPE 

ASPAHTSRGGLVISPESLSPPVRELYHRL^^ 

QRVYPAEPELQSHQASAARWSPSPLffiDLKVKQP 

W*GGRSGRTSWRLLALGCHT 

PESRHQCFSURSSHFLTMEMEQEKMTMNKliLSP 

DAAAYCCSACHGDETWSYNHPIRQRAKSRSLSA 

SPALGSTKEFRRTRSLHGPCPVTTFGPKACVLQN 

POTIMHIQDPASQRLTWNKSPKSVLVIKKMRDAS 

LLQPFKELCTHLMEENMIVYVEKKVLEDPAIASD 

ESFGAVKKKFCTFREDYDDISNQmniCLGGDGT 

LLYASSLFQGSVPPVMAFHLGSLGFLIPFSFENFQ 

SOVTQVIEGNAAVVL/RGSRLKVRWKELRGKK 

TAVHNGLGEKGSQAAGLDMDVGKQAMQYQVL 

NEWlDRGPSSYLSNVDVYLDGHLnTVQGD/G* 

GPOHLSWGP*AFLGRE*RLRLSLSGVIVSTPTGST 

AYAAAAGASMlifftrVPAIMrrPICPHSLSFRPIVV 

PAGVELKIMLSPEARNTAWVSFDGRKRQEIRHG 

DSISITTSCYPLPSICVRDPVSDWFESLAQCLHWN 

VRKKOAHFEEEEEEEEEG 


3413






1573 


3414 


A 


20 


2602 


-VnVNIQ<VNWINYlYYNQQgRAFHELKJilU.MSAi. 

ALGLPDLnSPFTFYESEREKMAVGVLTQTVGPW 

PRPVAYLSKQLDGVSKGWPPCLRALAATALLAQ 

EADKLTLGQNLNIKAPHAWTLMNTKGHHWLT 

NARLTKYQSLPCENPffllTEVCNTLNPTTLLPVSE 

SPGEHNCVEVLDSVYSSRPDLRDQPWASSVDWE 

LYMDGSSFINSQGERCAGYAVVTLDAVIKABCLW 

LOGTSAQKAELIALTOAVELSEGQESLEELLGRY 

FYVSHLPAFAKAVAQLCITCRQHNARQSPTVSPH 

lOAYGAAPFEDLQVDFTEMPKCGGNKYLLVLTC 

TYSGWVEAYPTRTEKAYEVTRVLLRDLIPRFGLP 

LRIGSHNGPVFVADLDCVEINVDTGVIWATWIKN 

EKDPVQLQKGKSGPSCTKGQCNPLELVITNPLDP 

RWKKGERVTLGINGAGLNPRVNILVRGEVYKCS 

LEPWQTFYDELNVPrmFPGKTRNLFLQLAEHV 

AOSLTVTSCYVCGGTVIADQWPWEARELVPTDP 

VPDEFPAQKNHPDNFWVLKASnRQYYIARVBKD 

FTLPVGRLHGG/RSNHTEKNPFSKFPKLQTV*AHP 

ESHRDWTAPTGLYWICGHRAYTKLP\ASSCVIGTI 

Z~^T^ r oTVTftBT I riiJPVYA5?R\KSIAIKN*NNDK 
KPSFFLLSUvluliULiVjrrv i AoivMwimi>j^ 

WPPERnQYYGPAT*AQDGSWGYRJPIYMINRIIRL 
OAVLKnTATGRALXmAQQETQMRNAIYQNRLA 
LDYLLAAEGEVCRKFNLTNCCLHIDNQGQVVED 
IVRDMTKVAHVPVQVWHGFDPGAMFRKWFPAL 
GGFKTLnRVUVIGTVLLLPRLLPVLLQMIKSFIAT 










LVYQNASAQVYYINHY 


3415 


A 


455 


108 


NMSWRGRSTYRPRPRRSLQPPELIGAMLEPTDEE 
PKEEKPPTKSRNPTPDQKREDDSG/SAA*DFKWP 
EPGKPIFQGAMVRPKTGG/CGCEGGY*CQGEDS\P 
KAEHFKMPEAGEGKSQV 


3416 


A 


1 


874 


FFFFQRJNFIEHSGSVSLLALACDLGWCEDWSCC 
LVQGGGDLVDWQTNHGEDEAGGDTDSVDEAR 

EHEETRTKQAALDGEPLGGGQLTAVHLHPSKEQ 

QGQEGGERQRGARTHHWRGWEKGRRVRLRPPS 

GKLRADQPVRKLGGPTPS/TELPGLQPHAPTPHT 

A/PATPTYSPAPDTPNPPVRWKCPLPVEPRTRQLC 

RERTRKACPPKPRPPLGLPGDPTGPVTHHAPPVS 

PTGASGQERRAEPGAVSYAHASATK 


3417


A 


JA5 


54/ 


AEDNKSSVDSSGQAAHPSKGKFFPHGTHWGTQC 

RGfflSVLGWQCSCPSTGCRVGLGLAMCQTHAYI 

HTHTHTHTHTPTDYGAHHTDPLQRWGLGPR\KS 

EAGPLPQLSRDQSHPGPLSPGASPRSAGLPGWHP 

AHQEPRARGRCARDGLSLQTRLTNKYDIQCCQE 

MRK 


3418 


A 


4073 


1000 



LDEYEARLTLANLDDFEEDNEDDDENRVNQEEK 

AAKITELINKLNFLDEAEKDLATVNSNPFDDPDA 

AELNPFGDPDSEEPITETASPRKTEDSFYNNSYNP 

FKEVQTPQYLNPFDEPEAFVTIKDSPPQSTKRKNI 

RPVDMSKYLYADSSKTEEEELDESNPFYEPKSTP 

PPhM.VNPVQELETERRVKRKAPAPPVLSPKTG ' 

LNENTVSAGKDLSTSPKPSPIPSPVLGRKPNASQS 

LLWCKEVTKNYRGVKITNFTTSWRNGLSFCAI 

LHHFRPDLroYKSLNPQDIKENNKKAYDGFASIGI 

SRLLEPSDMVLLAIPDKLTVMTYLYQIRAHFSGQ 

ELNWQIEENSSKSTYKVGNYETDTNSSVDQEKF 

YAELSDLKREPELQQPISGAVDFLSQDDSVFVND 

SGVGESESEHQTPDDHLSPSTASPYCRRTKSDTEP 

QKSQQSSGRTSGSDDPGICSNTDSTQAQVLLGKK 

RLLKAETLELSDLYVSDKKKDMSPPFICEETDEQ 

KLQTLDIGSNLEKEKLENSRSLECRSDPESPIKKT 

SLSPTSKLGYSYSRDLDLAKKKHASLRQTESDPD 

ADRTTLNHADHSSKIVQHRLLSRQEELKERARVL 

LEQARRDAALKAGNKHNTNTATPFCNRQLSDQ 

QDEERRRQLRERARQLIAEARSGVKMSELPSYGE 

MAAEKLKERSKASGDENDNIEIDTNEEIPEGFVV 

GGGDELTNLENDLDTPEQNSKLVDLKLKKLLEV 

QPQVANSPSSAAQKAVTESSEQDMKSGTEDLRT 

ERLQKTTERFRNPWFSKDSTVRKTQLQSFSQYI 

EDEVLNKGFXDS\SQYVVGELAALElSfEQKQIDTR 

AALVEKRLRYLMDTGRNTEEEEAMMQEWFML 

VNKKNALIRRMNOLSLLEKEHDLERRYELLNRE 

LRAMLAIEDWQKTEAQKRREQLLLDELVALVN 

KRDALVRDLDAQEKQAEEEDEHLERTLEQNKG 

KMAKKEEKCVLQ 


3419 


A 


4073 


1000 


LDEYEARLTLANLDDFEEDNEDDDENRVNQEEK 
AAKITELINKLOTIJDEAEBa^IATVNSNPFDDPDA 
AELNPFGDPDSEEPITETASPRKTEDSFYNNSYNP 



FKEVQTPQYLN PFDEPEAl'VilKDSPPQSTKKJU^ 



RPVDMSKYLYADSSKTEEEELDESNPFYEPKSTP 
PPNNLVNPVQELETERRVKRKAPAPPVLSPKTGV 
LNENTVSAGKDLSTSPKPSPIPSPVLGRKPNASQS 
LLWCKEVTKNYRGVKITNFTTSWRNGLSFCAI 
LHHFRPDLIDYKSLNPQDIKENNKKAYDGFASIGI 
SW.LEPSDMVLlAffDKLTVMmYQIRAHFSGQ 
ELNWQEENSSKSTYKVGNYETDTNSSVDQEKF 
YAELSDLKREPELQQPISGAVDFLSQDDSVFVNI) 
SGVGESESEHQTPDDHLSPSTASPYCRRTKSDTEP 
OKSOQSSGRTSGSDDPGICSNTDSTQAQVLLGKK 
RLLK^ELSDLYVSDKKKDMSPPFICEETOEQ 
KLOTLDIGSNLEKEKLENSRSLECRSDPESPKKT 
SLSPTSKLGYSYSRDLDLAKKKHASLRQTESDPD 
ADRTTLNHADHSSKIVQHRLLSRQEELKERARVL 
LEOARRDAALKAGNKHNTNTATPFCNRQLSDQ 
OOEERRRQLRERARQLIAEARSGVKMSELPSYGE 
MAAEKLKERSKASGDENDNIEIDTNEEIPEGFW 
GGGDELTNLENDLDTPEQNSKLVDLKLm.LEy 
OPOVANSPSSAAQKAVTESSEQDMKSGTEDLRT 
ERLOKTTERFRNPVVFSKDSTVRKTQLQSFSQYI 

ISSemkrqrsiqedtkkgneekaaitetqrk^ 

EDEVLNKGra)S\SQYWGELAALENEQKQIDTR 

aalvekrlrylmdtgrnteeeeammqewfml 

VNKKNALIRRMNQLSLLEKEHDLERRYELLNRE 

lramlaiedwqkteaqkrreqllldelvalvn 

^xS^VRDXDAQEKQAEEEDEHLERTLEQNKG , 

KMAKKEEKCVLQ 

EI^GPNYSHRLLHHPl'FYmH KKHHiiW lAi-i^ 

VISLYAHPIEHAVSNMLPVIVGPLVMGSHLSSITM 

wfslaliittishcgyhlpflpspefhdyhi^kfn 

QCYGVLGVLDHLHGTDTMFKQTKAYERHVLLL 
GFTPLSESIPDSPK 



LLTPCDGRlPU RPSVGABSUSUFQQRRRKKKUrE 



EPEKTELSERELAVAVAVSQENDEENEERWVGP 
LPVEATLAKKRKVLEFERVYLDNLPSASMYERS 
^SiwrHWCTKTDFIITASHDGHVKFWKKIE 
EGIEFVKHFRSHLGVIESIAVSSEGALFCSVGDDK 
AMKVFDWNFDMINMLKLGYFPGQCEWIYCPG 
DA^SVAASEKSTGOFIYDGRGDNQPLHIFDra^ 
TSPLTOIRLNPVYKAWSSDKSGMffiYWTGPPHE 
WPlSwNWEYKTDTDLYEFAKCKAYPTSVCFS 
PDGKKIATIGSDRKVRIFRFVTGKLMRVFDESLS 
MFllLQQMRQQLPDMEFGRRMAVERELEKyDA 

>SriWvtoetghfvlygtmlgk 

RILGKQENIRVMQLALFQGIAKKHUA™^^ 
SvLQNIQADFnVCTSFKKNBFYMFTKREPE 

dtSadsdrdvfnekpskeevmaatqaegpkrv 

SDSAIIH-reMGDIHTKLFPVECPKTVENFCVHSRN 
GYYNGHTFHRIIKGFMIQTGDFrGTGMGGESIWG 

gefedefhstlrhdrpytlsmanagsntogsqef 

I^^lvPlPWIJDNKHTWGRVTKGMEWQR^^ 

VNPKTDKPYEDVSmOTVK 

-FvTVCAPLTWAGAKHRRlVlAASKKPPKVKVisriQ 
^rni p>it.T^TTEPNEVTHSGDTGVETOGRMPPKVT 










SELLRQLRQAMRNSEYVTEPIQAYnPSGDAHQSE 

YIAPCDCRRAFVSGFDGSAGTAnXEEHAAMWTD 

GRYFLQAAKQMDSNWTLMKMGLKDTPTQEDW 

LVSVLPEGSRVGVDPLEPTDYWKKMAKVLRSA 

GHHLIPVKENLVDKIWTDRPERPCKPLLTLGLDY 

TGISWKDKVADLRLKMAERNVMWFWTALDEI 

AWLFNLRGSDVEHNPWFSYAHGLETIMLFIDGD 

RIDAPSVKEHLLLDLGLEAEYRIQVHPYKSILSEL 

KALCADLSPREKVWVSDKASYAVSEUPKDHRC 

CMPYTPICIAKAVXnKNSANESEGMRRAHIKDAVAL 

CELFNWLEKEVPKGGVTEISAADKAEEFRRQQA 

DFVDLSFPnSSTGPNGAimYAPVPETNRTLSLDE 

V YLIDSO Avj y J\X)ul J. D V 1 K 1 MW^ 

FTYVLKGHIAVSAAVFPTGTKGHLLDSFARSAL 

WDSGLDYLHGTGHGVGSFLNVHEGPCGISYKTF 

SDEPLEAGMIVTDEPGYYEDGAFGIRIENVVLW 

PVKTKYNFNmGSLTFEPLTLVPIQTKMIDVDSL 

TDKECDWLNNYHLTCRDVIGKELQKQGRQEAL 

EWLIRETQPISKQH 


3423 


A 


5515 


934 


FKMPENPATDKLQVLQVLDRLKMKLQEKGDTS 

QNEKLSMFYETLKSPLFNQILTLQQSIKQLKGQL 

NHIPSDCSANFDFSRKGLLVFTDGSITNGNVHRPS 

NNSTVSGLFPWTPKLGNEDFNSVIQQMAQGRQIE 

YIDIERPSTGGLGFSWALRSQNLGKVDIFVKDV 

QPGSVADRDQRLKENDQBLAINHTPLDQNISHQQ 

AIALLQQPEGSLRLrVAREPVHTKSSTSSSLNDTT 

LPEtvewsSvEEVfiLy^ 

WRTIVPGGLADRDGRLQTGDHILKIGGTNVQG 

MTSEQVAQVLRNCGNSVRMLVARDPAGDISVTP 

PAPAALPVALPTVASKGPGSDSSLFETYNVELVR 

KDGQSLGIRIVGYVGTSHTGEASGIYVKSIIPGSA 

AYHNGfflQVNDKIVAVDGVNIQGFANHDWEVL 

RNAGQVVHLTLVRRKTSSSTSPLEPPSDRGTWE 

PLKPPALFLTGAVETETNVDGEDEEDCERIDTLKN 

DNIQALEKLEKVPDSPENELKSRWENLLGPDYEV 

MVATLDTQIADDAELQKYSKLLPIHTLRLGVEV 

DSFDGHHYISSIVSGGPVDTLGLLQPEDELLEVN 

GMQLYGBCSRREAVSFLKEVPPPFTLVCCRRLFDD 

EASVDEPRRTETSLPETEVDHNMDVNTEEDDDG 

ELALWSPEVKIVELVKDCKGLGFSILDYQDPLDP 

TRSVIVIRSLVADGVAERSGGLLPGDRLVSVNEY 

CLDNTSLAEAVEILKAVPPGLVHLGICKPLVEDN 

EEESCYILHSSSNEDKTEFSGTIHDINSSLILEAPK 

GFRDEPYFKEELVDEPFLDLGKSFHSQQKEIEQS 

KEAWEMHEFLTPRLQEMDEEREMLVDEEYELY 

QDPSPSMELYPLSfflQEATPVPSVNELHFGTQWL 

HDNEPSESQEARTGRTVYSQEAQPYGYCPENVM 

KENFVMESLPSVPSTEGNSQQGRFDDLENLNSLA 

PLYQHQATRVISKASAYTGMLSSRYATDTCELPE 

REEGEGEETPNFSHWGPPRIVEIFREPNVSLGISIV 

GGQTVIKRLKNGEELKGIFIKQVLEDSPAGKTNA 

LKTGDKILEVSGVDLQNASHSEAVEAIKNAGNP 

WFIVQSLSSTPRVPNVHNKANKITGNQNQDTQ 

EKKEKRQGTAPPPMKLPPPYKALTDDSDENEEE 



3424 



3425 



2223 



1162 



1162 



3426 



3427 



755 



52 






aLSQSGRSHQNV^SAIIKTAPSKVKLVFIR 

otdSmavtpfpvpssspssiedqsgtopissee 

^StSvGKipESESFKLAVSQMKQQKYPpCY 
SssSS^ASSYHSTDADFTGYGGFQAPLSVD 

SSls?HEEAITALRQTPQKy^VVp^EA 
T^nKRNn.EIFPVDLQKKAGRGLGLSIV^ 



nrtakovasrvqkyfkltkagipvpgripnlt;! 

SJ^SSTCRRQHPLN]an.FmG™^^ 
SDSraSHMNTAVEDASDDESIPIMYW^ 

KSrcG]^IQG\VRW\HCR\DCPP\EMSL\DFaDS 
cSS^nnCGDHQLEPnRSVETTLDRDYCV 



mWvKLVFDKGLPARPKSPLDPKKDGESLSYS 
MLPLSDGPEGSSSRPQMIRGRLCDDTKPETFNQL 

^S^Seqllkyppeevesrrwqki^^^^^ 
StSvasrvqkyfikltkagipvpgrtpnlyi 

^SSsRRQIffLNKHLFKPVGTFNmHEPWY 
J^DDRSCmHMOTAVEDASDDESIPIkmN 

kcdncgiepiqg\ww\hcrvix:ppvemsl\df^^^ 

SSmTMMHKGDHQLEPIYRSXETFLDRDYCV 

S^Saaphcshsrqwrcsqtkmq 

So^ATENAKEEVRRILGLLDAYLKTRlTLVG 

^SS^lwlykqvlepsfrqafpnto 

SSSSQFRA\VFGEVra.CEKMAQF\DAKK 

fSS?^kgsreekqkpqaerke^ 

A^/i^MDECEQALAAEPKAKDPFAHLPKS 
^mDBreRKYSNEDTLSVALPYFWEH^ 

SwSSeeltqtfmscnlitgmfqrldklr 

SlsmFGTWSSSISGVWWRGQELAFPLSP 
^Q^^■SY^WRKLDPGSEETQTLVREYFSWE 

r.AFOHVGKAFNQGKIFK 



TAARRRQKGTAARRKCjkO 1 AARRRQKGTAA^ 
NVDEN 


3428 


A 


4 


1939 


LPLSLSFSEMPLPLLPMDLKGEPGPPGKPGPWGP 

PGPPGFPGKPGHGKPGLHGQPGPAGPPGFSRMG 

KAGPPGLPGNVGPPGQPGLRGEPGIRGDQGLRGP 

PGPPGLPGPSGITLPGKPGAQGVPGPPGFQGEPGP 

QGEPGPPGDRGLKGDNGVGQPGLPGAPGQGGAP 

GPPGLPGPAGLGKPGLDGLPGAPGDKGESGPPG 

VPGPRGEPGAVGPKGPPGVDGVGVPGAAGLPGP 

QGPSGAKGEPGTRGPPGLIGPTGYGMPGLPGPKG 

DRGPAGVPGLLGDRGEPGEDGEPGEQGPQGLGG 

PPGLPGSAGLPGRRGPPGPKGEAGPGGPPGVPGI 

RGDQGPSGLAGKPGVPGERGLPGAHGPPGPTGP 

KGEPGFTGRPGGPGVAGALGQKGDLGLPGQPGL 

RGPSGIPGLQGPAGPIGPQGLPGLKGEPGLPGPPG 

JbOKAVjUrO 1 AvjrXKvjr'r O V r OorvJl 1 VjrrsjxLrKjrr 

GAPGAFDETGIAGLHLPNGGVEGAVLGKGGKPQ 

FGLGELSAHATPAFTAVLTSPLPASGMPVKFDRT 

LYNGHSGYNPATGIFTCPVGGVYYFAYHVHVKG 

Tm^WVALYK>4NVPATYTYDEYKKGYLDQASG 

GAVLQLRPNDQVWVQMPSDQANGLYSTEYIHSS 

FSGFLLCPT 


3429 


A 


212 


1075 


EGLTGPCERVPFLLGRGPPHGATRAGHRRAVRW 

AGPESLPPLPRSLIMDSPRAGTHQGPLDAETEVG 

AT^or^xcxA vr\T7/^px>r^\/iM^\/nvr^APT GPni DA'N^r^^ 
AUKC 1 o 1 A I v^ill^ivrv^ V n,\^ V Oivl^Ai^i^aJrOJLJr/VJVlvj 

GPGPGPCEDPAGAGGAGAGGSEPLVTVTVQCAF 

TVALRARRGADLSSLRALLGQALPHQUQLGQLS 

YLAPGEDGHWVPIPEEESLQRAWQDAAACPRGL 

QLQCRGAGGRPVtYQWAQHSYSAQGPEDLGF 

RQGDTVDVLCEVDQAWLEGHCDGRIGIFPKCFV 

VPAGPRMSGAPGRLPRSQQGDQP 


3430 


A 


799 


1989 


INKYIMRKKIKLLSPLPPLWSHLALLQAS ATKWV 
LTPAAFAGKLLSVFRQPLSSLWRSLVPLFCWLRA 
TFWLLATKRRKQQLVLRGPDETKEEEEDPPLPTT 
PTSVNYHFTRQCNYKCGFCFHTAKTSFVLPLEEA 
KRGLLLLK\EAG\LEKINFSGG\EPFLQDRGEYLGK 
T vpppi<rvPT PT PQVQTW^Jxm^sT TPPPWPnxrvrtvp 

L. Vj^TL/rw. Vlil-fl\JLrro V oiw oiNUrol-»livDiv wrv^iN i \j\c, 

YLDILA1SCDSFDEEVNCP\IGRGN\GKKNHVENL 

QKL\RRWCRDYRVPFKINSVINPFmEEDMTEQI 

KALNPVRWKVFQCLLIEGENCGEDA\LREAERFV 

IGDEEFERFLERHKEVSCLVPESNQKMKDSYLIL 

DEYMRFLNCRKGRKDPSKSILDVGVEEAIKFSGF 

DEKMFLKRGGKYIWSICADLKLDW 


3431 


A 


5468 


2146 


ACGFLPGRCHFSTFKQCQEWLSRLSRATARPAKP 

EDLFAFAYHAWCLGLTEEDQHTHLCQPGEHIRC 

RQEAELARMGFDLQNVWRVSHINSNYKLCPSYP 

QKLLVPVWTTDKELENVASFRSWKRIPWVYRH 

LRNGAAIARCSQPEISWWGWRNADDEYLVTSIA 

KACALDPGTRATGGSLSTGNNDTSEACDADFDS 

SLTACSGVESTAAPQKLLILDARSYTAAVANRAK 

GGGCECEEYYPNCEWFMGMANIHAIRNSFOYL 

RAVCSQMPDPSNWLSALESTKWLQHLSVMLKA 

AVLVANTVDREGRPVLVHCSDGWDRTPQIVALA 

KILLDPYYRTLEGFQVLVESDWLDFGHKFGDRC 

GHQENVEDQNEQCPVFLQWLDSVHQLLKQFPCL 

FEFNEAFLVKLVQHTYSCLYGTFLANNPOEREK 

RNmC/RGTCSVWALLRAGNKNFHNF^ 
3432



36 



1873 



3433 



1481 



476 






3434 



1720 



1243 







PIATASS 



KES 
LST 



IfSEDTCKSGpW^ 










VRMTKSFLPURRAKGRVVNISSMUjRMANPAR 

SPYCITKFGVEAFSDCLRYEMYPLGVKVSWEPG 

NFIAATSLYSPESIQAIAKKMWEELPEWRKDYG 

KKYFDEKIAKMETYCSSGSTDTSPVIDAVTHALT 

ATTPYTRYHPMDYYWWLRMQIMTHIJPGAISDM 

lYJR 


3435 


A 


842 


3595 


ENQQQMLVAKEQRLHFLKQQERRQQQSISENEK 

LQKLKERVEAQENKLKKIRAMRGQVDYSKIMN 

GNLSAEIERFSAMFQEKKQEVQTAILRVDQLSQQ 

LEDLKKGKLNGFQSYNGKLTGPAAVELKRLYQE 

LQIRNQLNQEQNSKLQQQKELLNKKNMEVAMM 

DKRISELRERLYGKKIQACEKVFLNRVNGTSSPQ 

SPLSTSGRVAAVGPYIQVPSAGSFPVLGDPKPQS 

LSIASNAAHGRSKSANDGNWPTLKQNSSSSVKP 

VQVAGADWKDPSVEGSVKQGTVSSQPVPFSALG 

PTEKPGIEIGKVPPPIPGVGKQLPPSYGTYPSPTPL 

GPGSTSSLERRKEGSLPRPSAGLPSRQRPTLLPAT 

GSTPQPGSSQQIQQRISVPPSPTYPPAGPPAFPAGD 

SKPELPLTVAIRPFLADKGSRPQSPRKGPQTVNSS 

SIYSMYLQQATPPKNYQPAAHSALNKSVKAVYG 

KPVLPSGSTSPSPLPFLHGSLSTGTPQPQPPSESTE 

KEPEQDGPAAPADGSTVESLPRPLSPTKLTPIVHS 

PLRYQSDADLEALRRKLANAPRPLKKRSSITEPE 

GPGGPNIQKLLYQRFNTLAGGMEGTPFYQPSPSQ 

DFMVTLADyDNGNTNANGNLEELPPAQPTAPLP 

AEPAPSSDANDl^m 

EDNNNNVATVPTTEQIPSPVAEAPSPGEEQVPPA 

■DT T>D A OTITD'D A TCTTvTl/'D'nvTf VT/'DXTQCDT/rilJ/^T T>'\/'D 

r Lr r AorlrrA 1 o 1 IN ISJv i JLisJSJrJN bcK 1 uriOLK V K 
FNPLALLLDASLEGEFDLVQRinrVEDPSKPNDE 
GITPLHNAVCAGHHHIVKFLLDFGVNVNAADSD 
GWTPLHCAASCNSVHLCKQLVESGAAIFASTISD 
lETAADKCEEMEEGYIQCSQFLYGVQEKLGVMN 
KGVAYALWDYEAQNSDELSFHEGDALTILRRKD 
E 


3436 


A 


3 


2604 


GSTHASEKMKTGRSALWTDTGDMSVLNSPRHQ 

SCIMHVDMDCFFVSVGIRNRPDLKGKPVAVTSN 

RGTGRAPLRPGANPQLEWQYYQNKILKGKADIP 

DSSLWENPDSAQANGIDSVLSRAEIASCSYEARQ 

LGIKNGMFFGHAKQLCPNLQAVPYDFHAYKEVA 

QTLYETLAS\YTHNIEAVSCDEALVDITEILAETK 

LTPDEFANAVRMEIKDQTKCAASVGIGSNILLAR 

MATRKAKPDGQYHLKPEEVDDFIRGQLVTNLPG 

VGHSMESKLASLGDCTCGDLQYMTMAKLQKEF 

GPKTGQMLYRFCRGLDDRPVRTEKERKSVSAEI 

NYGIRFTQPKEAEAFLLSLSEEIQRKLEATGMKG 

KRLTLKIMVRKPGAPVETAKFGGHGICDNIARTV 

TLIXJATDNAKHGKAMLNMFHTMKLNISDMRGV 

VJJXl Vi^V^i-f V A 1 l^i^i^IrO 1 Olvir O V V^OonFx OVJO X O V 

RDVFQVQKAKKSTEEEHKEVFRAAVDLEISSASR 

TCTFLPPFPAHLPTSPDTNKAESSGKWNGLHTPV 

SVQSRLNLSmVPSPSQLDQSVLEALPPDLREQVE 

QVCAVQQAESHGDKKKEPVNGCNTGILPQPVGT 

VLLQIPEPQESNSDAGINLIALPAFSQVDPEVFAA 

LPAELQRELKAAYDQRQRQGENSTHQQSASASV 
32



4038 



469 



2602 






IIISJ^PSSSSSSSSSGFAEPLEIRPSPPTSRGG 

S(^Sfslssfpdlmgelisdeapsipapt 

SSf^^A^smSlwSYPEGGVKVLITGPWTEA 
I??SSISsPFRGMSLLHLAAAQGYi^ 
SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon, /=possible nucleotide deletion, \=possible nucleotide insertion)










DLDENRVWESDI\KJDLKEHLGEQEVSISLIVDSVEE 

GDLGNYSCYVENGNGRRHASVLLHKRELMYTV 

ELAGGLGAILLLLVCLVTIYKCYKffilMLFYRNHF 

GAEELDGDNKDYDAYLSYTKVDPDQWNQETGE 

EERFALEILPDMLEKHYGYKLFIPDRDLIPTGTYI 

"DFW/ A ■Dr'\/T\OCVDT TT\r\iirrDXTV\7A7T51?n\17C!T17'CT tTT 

cUVAKl^VJJv^bJsJNJ-rUVMlJrJN Y V Vl%JK.vjWMriiLJDl 

RLR^nSdLVTGEIKVILIECSELRGIMNYQEVEALK 

HTIKLLTVIKWHGPKCmLNSKFWKM.QYE]V^^ 

KRffiPITHEQALDVSEQGPFGELQTVSAISMAAAT 

STALATAHPbLRSTFHNTYHSQMRQKHYYRSYE 

YDVPPTGTLPLTSIGNQHTYOSriPMTLINGQRPQT 

KSSREQNPDEAHTOSAILPLLPRETSISSVIW 


3439 


A 


251 


2037 


GPGNSSILIGGGHLFLIRSCLNLLLLNSKENTEHT 

MAKKVAVIGAGVSGLSSIKCCVDEDLEPTCFERS 

DDIGGLWKFTERGSSLSVMIWPLALSLLRHGGFC 

YSDFPFHEDYPNFMNHEKFWDYLQEFAEHFDLL 

KYIQFKTTVCGITKRPDFSETGQWDWTETEGKQ 

NRAVFDAVMVCTGHFLNPHLPLEAFPGIHKFKG 

QILHSQEYKIPEGFQGKRVLVIGLGNTGGDIAVEL 

SRTAAQVLLSTRTGTWVLGRSSDWGYPYNMMV 

TRRCCSFIAQVLPSRFLNWIQERKLNKRFNHEDY 

GLSITKGKKAKFIVNDELPNCILCGAITMKTSVIE 

FTETSAVFEDGTVEENIDVVIFTTGYTFSFPFFEEP 

T T/'OT On/'T/'TT?! "VTV /^'l 7T?T>T XTT "CD A T*! A TT/^T TOT V/^C 

LKoLCTKKlr LYKQ Vrr LNLisKA 1 LA11CjL1uJL.KCj5> 
ILSGTBLQARWVTRVFKGLCKRPASQKLMMEAT 
EKEQLIKRGVFKDTSKDKFDYIAYMDDIAACIGT 
KI^SIPLLFLKDPRIAWEVFFGPCTPYQYR\LMGPG 
KWDGARNAILTQWDRTLKPLKTRIVPDSSKAWP 
SM\SHYLKAWGAPVLLASLLLICK\SSLFLKLVRD 
KLQDRMSPYLVSLWRG 


3440 


A 


1 


3533 


IMPCGSSRLLRGCWTHPNEPVSDLSVFDCIESVM 

ENSKVLGESMAGISQNAKTGDLPAFGECVGIASK 

ALCGLTEAAAQAAYLVGIFDPNSQAGHQGLVDP 

IQFARANQAIQMACQNLVDPGSSPSQVLSAATIV 

AKHTSALCNACRIASSKTANPVAKRHFVQSAKE 

VANSTANL\n&TIKALDGDFSEDMlNKCRlATAPL 

IEAVENLTAFASNPEFVSIPAQISSEGSQAQEPILV

SAKPMLESSSYLIRTARSLAINPKDPPTWSVLAG 

HSHTVSDSnCSLITSIRDKAPGQRECDYSIDGINRC 

IRDIEQASLAAVSQSLATRDDISVEALQEQLTSW 

QEIGHLIDPIATAARGEAAQLGHKGTQLASYFEP 

LILAAVGVASKILDHQQQMTVLDQTKTLAESAL 

QMLYAAKEGGGOTKAQHTHDATTEAAQLMKEA 

VDDIMVTLNEAASEVGLVGGMVDAIAEAMSKL 

DEGTPPEPKGTFVDYQTTWKYSKAIAVTAQEM 

MTKSVTNPEELGGLASQMTSDYGHLAFQGQMA 

AATAEPEEIGFQIRTRVQDLGHGCIFLVQKAG\AL 

QVCPTDSYTKRELIECARAVTEKVSLVLSALQAG 

AENSETFADHRENE.KTAKALVEDTBCLLVSGAAS 

TPDKLAQAAQSSAATTTQLAEWKLGAASLGSD 

DPETQVVLINAIKDVAKALSDLISATKGAASKPV 

DDPSMYQLKGAAKVMVTNVTSLLKTVKAVEDE 

ATRGTRALEAHECIKQELTVFQSKDVPEKTSSPE 

ESIRMTKGITMATAKAVAAGNSCRQEDVIATAN 


SvSavtcliqaaeamkgtewvdpedptv^ 
SSuvasieaWleqlkprakpkqade-^^ 

VA^^QDSEAMRBLQAAGNAVKRASDNL 

VRAAOKA^^GKADDDDVWK-nCFVGGIA^ 

^;S^EARKKLAQIRQQQYKFLPm 

SoGVGVRGARAMATVqbKAAALNLSAI^s 


3441


A 


3 


1584 


pI^PGFSVAQKPFGATYVWSSIINTLQTQVEy 

S^SShndcfvgseavdvifshliqnkyf 

^^S^VRVCQALMDYKVFEAVPIKVFG 
SS^sJ:SLWFrriPNQDSQLGKENKLY 

^S?SSfkssdirsasledlwenlslkp/^s 

SsmSPQVINEVWQEETIGRLLQl^LPLL 

dslSoeavpkipqpkrqstmvnssnyldrg^ 

SDSQEDEWLSAAroCI^YLPDQMW 
FsSSS^SKGKTOLLVLFLNMDHQKDWa 


3442 


A 




822 




3443 


A 


3 


1373 


^nVQGGGTVRLLLILSGCLVYGTAETDVNWJ^ 

oeSSqqfcytwlipQWHdiwtw^ 
wSrlvS^Qveneeklkeleqfsiwnffsse. 

Sro^^GLYSTXTCLKVEDEKDmYSyi 

SSotklflvfllglmlffcgdllsrsqifyys 

?S?SLXWlLSmffKKSPIYVILVG^^^^ 

SoSJmLLTEEEYRIQGEVETRKALEELR 
CTn^SAWKTVSRlQSPKRFADFVEGSSHLT 
S^^LGSnAQDElYEEASSEEEDSYS 

-- I^^OTESDSEKTrKKbNLOPRivlOPPLG 


3444 


A 


566 


1718 


SIg'Xntamkkkvix^^ 

^^!S^pp.ni1.RLSRPLECSCFRTSIW 










DETLYKAWSSIVYQLIPNVQQLEMNLKNFAEIIE 

ADEVLLFERATFLVISHYQCKEQRDAHRFEKISNI 

KQFKLSCSKLAASFQSMEVRNSNFAAFIDIFTSN 

TYVMVVMSDPSIPSAATLmiRNARKHFEKLERV 

DGPKQCLLMR 


3445 


A 


566 


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLG 

EPG\GSLGWVLPNTAMKKKVLLMGKSGSGKTS 

MRSnFANYIARDTRRLGATILDRIHSLQINSSLST 

YSLVDSVGNTKTTDVEHSHVRFLGNLVLNLWDC 

GGQDTFMEhmrSQRDMFRl^VLIYVFDVESR 

FT FKDMHYYOSri FATT ONSPDAKIFCT VHKMD 

LVQEDQRDLIFKEREEDLRRLSRPLECSCFRTSIW 

DETLYKAWSSIVYQLIPNVQQLEMNLRNFAEIIE 

ADEVLLFERATFLVISHYQCKEQRDAHRFEKISNI 

IKQFKLSCSKLAASFQSMEVRNSNFAAFIDIFTSN 

TYVMVVMSDPSIPSAATLINIRNARKHFEKLERV 

DGPKQCLLMR 


3446 


A 


566 


1718 



KGLERTCCAMEESDSEKTTEKENLGPRMDPPLG 

EPG\GSLGWVLPNTAMKKKVLLMGKSGSGKTS 

MRSIIFANYL\RDTRRLGATILDRIHSLQINSSLST 

YSLVDSVGNTKTFDVEHSHVRFLGNLVLNLWDC 

GGQDTFMEhmrSQRDNIFRNVEVLnrVFDVESR 

FT FTCDMHYYO^sPT FATT ONSPnATCTFrT VHKMD 

LVQEDQRDLEFKEREEDLRRLSRPLECSCFRTSIW 

DETLYKAWSSIVYQLIPNVQQLEMNLRNFAEIIE 

ADEYLLEERATELyiSHyQCKEQRDAHRFEKISNI 

ii£QFk|;s;esiCtAAs^^ 

TYVMWMSDPSIPSAATLINIRNARKHFEKLERV 
DGPKQCLLMR 


3447 


A 


1 


2930 


VLLGPLWDKLSTADPIPVIVTMASKRKSTTPCMIP 

VKTVVLQDASMEAQPAETLPEGPQQDLPPEASA 

ASSEAAQNPSSTDGSTLANGHRSTLDGYLYSCK 

YCDFRSHDMTQFVGHl^SEfrrDFNKDPTFVCSG 

CSFLAKTPEGLSLHNATCHSGEASFVWNVAKPD 

NHVVVEQSIPESTSTPDLAGEPSAEGADGQAEinT 

KTPEMKMKGKAEAKKIHTLKENVPSQPVGEALP 

KLSTGEMEVREGDHSFINGAVPVRQASASSAKN 

PHAANGPLIGTVPVLPAGL\QFLSLQQQPPVHAQ 

HHVHQPLPTAKALPKVMIPLSSIPTYSAAMDSNS 

FLKNSFHKFPYPTKAELCYLTVVTKYPEEQLKIW 

FTAQRLKQGISWSPEEmDARKKMFNTVIQSVPQ 

PTITVLNTPLVASAGNVQHLIQAALPGHWGQPE 

GTGGGLLVTQPLMANGLQATSSPLPLTVTSVPK 

QPGVAPINTVCSNTTSAVKWNAAQSLLTACPSI 

TSQAFLDASIYKNKKSHEQLSALKGSFCRNQFPG 

QSEVEHLTKVTGLSTREVRKWFSDRRYHCRNLK 

GSRAMIPGDHRSinDSVPEVSFSPSSKVPEVTCIPT 

TATLATHPSAKRQSWHQTPDFTPTKYKERAPEQ 

LRALESSFA0OTLPLDEEIJ3RLRSETKMTRREIDS 

WFSERRKKVNAEETKKAEENASQEEEEAAEDEG 

GEEDLASELRVSGENGSLEMPSSMLAERKVSPIK 

INLKNLRVTEANGRNEIPGLGACDPEDDESNKLA 

EQLPGKVSCKKTAQQRHLLRQLFVQTQWPSNQD 

YDSIMAQTGLPRPEWRWFGDSRYALKNGQLK 

WYEDYKRGNFPPGLLVIAPGNRELLQDYYMTHK 
3448 A



1324



3449 A



2389 



3450 



201 



1705 






MLYEEDLQNLCUKTQMSSQQVKQWtAhkMuHE 
TOAVADTGSEDQGPGTGELTAVHKGMGDTYSE 
v<;p>JSF.SWEP RVPEASSEPFD\TSSPQAGRQLETD 
FVARAEKGFKl'REAHLLQVAGVGTGLQNUASLS 



GLASGVMAQRAFPNPYADYNKSLAEGYFDAAG 

S^pSrltokireixqqmergi^^^ 

GYTGWAGIAVLYLHLYDVFGDPAYLQLAHGYV 

KQSLNCLTIalSImCGDAGPLAVAAVLYm^O^ 

NEKQAEIXJimmLNKTOPHAPNEMLYGRIGYry 

ALL^/NKNFGVEKIPQSfflQQICETILTSGENLAR 

KRNFTAKSPLMYEWYQEYYVGAAHGLAGIYYY 

^SS^klhslVkpsvdyvcqlkfpsgn 

YpSDNRDLLVHWCHGAPGVryi^lQAYKVF 

rSSylc\dayqcadviwqygllkkgyglcy\ 
SayafltlWqdmkylyrach'aewc 
leygehgcrtpdtpfslfegmagtiyflxadllfp 



tkarnfpafel 



srhvtgaarspsragpsdppamgdedddksuav 
eSteanltoheeicvsvenfellkvlgt^^^^^ 
vflvrkagghdagklyamkvlrkaalvqrak 

ivlalehlhklgiiyrdlklenvlldseghivltd 

?CT^FLraEKERTFSFCGTOYMAPEnRSKTGH 

gkavdwwslgillfelltgaspftlegerntqae 

VSRRILKCSPPFPPRIGPVAQDLLQRLLCKDPiaCR 

LGAGPQGAQEVRNHPFFQGLDWVALAARKIPAP 

mQmSELDVG\NFAEEFrRLEPVYSPPGQ\PPPG 

DPRffQGYSFVAPSILFDHNNAVMTDGLEAPGAQ 

DRPGRAAVARSAMMQDSPFFQQYELDLFEPALG 

OGSFSVCRRCRQRQSGQEFAVKILSRRLEANTQR 

EVAALRLCQSHPl^^LHEVHHDQLHTYLN^EL 

LRQGELLEfflRKKRHFSESEASQE,RSLVSAVSFM 

IffiEAGWHRDLKPENILYADDTPGAPVKIIDFG/F 

SPRLRPQSPGWMQTPSFTLQYAAPELLAQQGYp 

ESCDLWSLGmYMMLSGQAPFQGASGQGGQS 

OAAEMCKIREGRFSLDGEAWQGVSEEAKELVR 

QLLTVDPAKRLKLEGLRGSSWLQDGSARSSPPLR 

Sdvlessgpavrsglnatfmafnrgkregfflk 

SVENAPLAKRRKQKLRSATASRRGSPAPANPGR 
APVASKGAPRRANGPLPPS 



KGTEMNKSRWQSRRRHGRRSHQQNPWhKLlUJa 

eSrsSaqpahdsghgddespstssgta^tss 

VPELPGFYFDPEKKRYFRLLPGHNNOO'LTKESIR 

SsaSS^le^tosdri^d^^^^^ 
kygiinlqslktpilkvfmhenlyftnrk\ansv 

cwaslnhldshillclmglaetpgcatllpaslf 

$5^riiAG©RPG\MLCSFRffGAWSCAWSLNIQA 
NNCFSTGLSRRVLLTNVVTGHRQSFGTNSp\nLA 
QQFALMAPLLFNGCRSGEIFAIDLRCGNQGKGW 
STOLFHDSAVTSVRILQDEQYLMASDMAGm 

r^LRTIKCVRQYEGHVNEYAYlJl^Iffi^^^ 
T vAvnnnCYIRIWSLHDARLLRTIPSPYPASKAD 



IPSVAFSSRLGGSRGAPGLLMAVGQDLYCYSYS 



3451 



19 



6033 



LLSAMLSHGAGLALWITLSLLQTGLAEPERCNFT 

LAESKASSHSVSIQWRILGSPCNFSLIYSSDTLGA 

ALCPTFRIDNTTYGCNLQDLQAGTIYNFKnSLDE 

ERTVVLQTDPLPPARFGVSKEKTTSTGLHVWWT 

PSSGKVTSYEVQLFDENNQKIQGVQIQESTSWNE 

YTFFNLTAGSKYNIATTAVSGGKRSFSVYTNGST 

VPSPVKDIGISTKANSLLISWSHGSGNVERYRLM 

LMDKGILVHGGVVDKHATSYAFHGLSPGYLYNL 

TVMTEAAGLQNYRWKLVRTAPMEVSNLKVTND 

GSLTSLKVKWQRPPGVNVDSYNITLSHKGTIKESR 

VLAPWIIAETHFKELVPGRLYNQVTCSAVSLGELS 

AQKM\AVGRTFPDKVANLEANNNGRMRSLVVS 

WSPPAGDWEQYRILLFNDSVVLLNITVGKEETQ 

YVMDGTGLVPGRQYEVEVIVESGKLKNSERCQG 

RTWLAVLQLRVKHANETSLSIMWQTPVAEWEK 

YIISLADRDLLLIHKSLSKDAKEFTFTDLVPGRKY 

MATVTSISGDLKNSSSVKGRTVPAQVTDLHVAN 

QGMTSSLFTNWTQAQGDVEFYQVLLIHENVVIK 

NESISSETSRYSFHSLKSGSLYSVWTTVSGGISSR 

QVVVEGRTVPSSVSGVTVNNSGRNDYLSVSWLL 

APGDVDNYEVTLSHDGKVVQSLVIAKSVRECSF 

SSLTPGRLYTVTITTRSGKYENHSFSQERTVPDKV 

QGVSVSNSARSDYLRVSWVHATGDFDHYEVTIK 

NKNNFIQTK5IPKSENECVFVQLVPGRLYSVTVT 

TKSGQYEANEQGNGRTIPEPVKDLTLKNRSTEDL 

HVTWSGANGDXODQYEIQLLFNDMKVPPPFHLVN 

TATEYRFTSLTPGRQYKILVLTISGDVQQSAFIEG 

FTVPSAVKNIfflSPNGATDSLTVNWTPGGGDVDS 

YTVSAFRHSQKVDSQTEPKHVFEHTFHRLEAGEQ 

YQIMIASVSGSLKNQINVVGRTVPASVQGVIADN 

AYSSYSLIVSWQKAAGVAERYDILLLTENGILLR 

NTSEPATTKQHKFEDLTPGKKYKIQILTVSGGLFS 

KEAQIEGRTVPAAVTDLRITENSTRHLSFRWTAS 

EGELSWYNIFLYNPDGNLQERAQVDPLVQSFSFQ 

NLLQGRMYKMVIVTHSGELSNESFIFGRTVPASV 

SHLRGSNRNTTDSLWFNWSPASGDFDFYELELYN 

PNGTKKENWKDKDLTEWRFQGLVPGRKYVLW 

WTHSGDLSNKVTAESRTAPSPPSLMSFADIANT 

SLAITWKGPPDWTDYNDFELQWLPRDALTVFNP 

YNNRKSEGRIVYGLRPGRSYQFNVKTVSGDSWK 

TYSKPIFGSVRTKPDKIQNLHCRPQNSTAIACSWI 

PPDSDFDGYSIECRKMDTQEVEFSRKLEKEKSLL 

lOT^lMLWHKRYLVSKVQSAGMTSEVVEDSTIT 

MroRPPPPPPHmVNEKDVLISKSSI>OTVNCSWFS 

DTNGAVKYFTWVREADGSDELKPEQQHPLPSY 

LEYRHNASIRVYQTNYFASKCAENPNSNSKSFNI 

KLGAEMESLGGKCDPTQQKFCDGPLKPHTAYRI 

SmAFTQLFDEDLKEFTKPLYSDTFFSLPITTESEP 

LFGAffiGVSAGLFLIGMLVAWALLICRQKVSHG 

RERPSARLSIRRDRPLSVHLNLGQKGNRKTSCPIK 

INQFEGHFMKLQADSNYLLSKEYEELKDVGRNQ 

SCDIALLPENRGKNRYNNELPYDATRVKLSNVDD 

DPCSDYINASYIPGNNFRREYIVTQGPLPGTKDDF 

WKMVWEQNVHNIVMVTQCVEKGRVKCDHYW 


?ADODSLYYGDLE-QMLSESVLPEWnREF]iiCUJi 

EOLDAHFLIRHFHYTVWPDHGVPETTQSLIQFVR 

rVRDYINRSPGAGPTVVHCSAGVGRTGTFIALDR 

[LOQLDSKDSVDIYGAVNHDLRLHRVHMVQTEC 

Q\Ai™QCVM)VLRARKLRSEQENPLFPIYENV 

NPF.YHRDPVYSRH 


3452 


A 


63 


1073 


FFRSSSDNGSPlRQYE/HSTPAHQGPVMGLEtot^ 

ARNSQLRIVLVGKTGAGKSATGNSILGRKVFHSG 

TAAK5ITKKCEKRSSSWKETELVWDTPGIFDTE 

VTO^TSKEIIRCILLTSPGPHALLLWPLGRYTEE 

EHKATEKILKMFGERARSFMILIFTRKDDLGDTN 

LHDYLREAPEDIQDLMDIFGDRYCALNNKATGA 

EQEAQRAQLLGLIQRWRENKEGCYTOM^R 

AEEEIQKQTQAMQELHRVELEKEKARIREBYEEK 

IRKLEDKVEQEKRKKQMEKKLAEQEAHY AVRQ 

OR ARTEVESKDGILEUMTALOIASFILLRLFAED 


3453 


A 


2674 


514 


GPITFLKKJCAKMOJMPLRIHVLLCiLAi i il. v v^V 

DKKVDCPRLCTCEIRPWFTPRSIYIVIEASTVDCND 

LGLLTFPARLPANTQILLLQTWnAKIEYSTDFPV 

NLTGLDLSQNNLSSVTNINGKKMPQLLSVYUEBN 

KLTELPEKCLSELSNLQELYINHNLLSTISPGAFIG 

LHNLLRLHLNSNRLQMINSKWFDALPNLEILMIG 

ENPIIRIKDMNFKPLI>a.RSLVIAGINLTEIPDNAL 

VGLENLESISFYD-NRLIKVPHVALQKWNLI^ 

LNKNPINRIRRGDFSNMLHLKELGINNMPELISID 

SLAVDNLPDLRKIEATNNPRLSYIHPNAFFRLPKL 

ESLMLNSNALSALYHGTffiSLPNLKEISfflSNPIRC 

DCVIRWMNMNKTNIRFMEPDSLFCVDPPEFQGQ 

NVROVHFRDMMEICLPLIAPESFPSNLNVEAGSY 

VSFHCRATA\EPQPEIYWITPSGQKLLPNT\LTDKF 

YVHSEGTLDINGVTPKEGGLYTCIATNLVGADLK 

SVMIKVDGSFPQDNNGSLNIKIRDIQANSVLVSW 

KASSKILKSSVKWTAFVKTENSHAAQSARIPSDV 

KVYNLTHLNPSTEYMCroiPTIYQKNRKKCVNyT 

TKGLHPDQKEYEKNNTTTLMACLGGLLGnGVIC 

LISCLSPEMNCDGGHSYVRNYLQKPTFALGELYP 

PT ITJT.WEAGKEKSTSLKVJCATVIGLPTNMS 


3454 


A 


1844 


244 


" ERYLFATYVAPSATLDlGLOgEKKKElYMJUyw- 

Slfdtaeeyilllllepwtkmvksdqiaykcv 

ELVEETOQLDSTYFRKLQALHKETFSKKATOTTC 

eigtgilslsnvskrteywdnvpaeykhfkfsdl 

LNNKLEFEHFRQFLETHSSSMDLMCWTDEQFBJl 
rrYRDKNQRKAKSIYIKNKYLNKKYFFGPNSPAS 

lyoonqvmhlsggwqkilheqldapvlveiqk 
hvqnrlenvwlplflaseqfaarqkikvqmkdi 

AEELLLQKAEKKIGVWKPVESKWISSSCKn^RK 

allnpvtsrqfqrfvalkgdllengllfwqevq 

KYKDLCHSHCDESVIQKKITniNCFINSSIPPALQI 
DffVEQAQKHEHRKELGPYVFREAQMTFLGVNff 

TCFWPOFCEFRKJilj ll^t'^-L'-"^'^ v i-ixiravv^i-' J. i^A^Nc* 
KlAVL/QNDEKSGKDGKQYANTSWAKmi^^ 
DSFLGLQPYGRQPTWCYSKYIEALEQERILLKIQE 
PT ,F.K\SCLOACNLS0ILRLALQLCL 


3455 


A 


228 


3330 


- IfFfAQAMMSFGGADALLGAPFAPLHGGGiiUlX 
ATARKGGAGGTOSAAGSSSGFHSWTOTSVSSVS 


ASPSRPRGAGAASSTDSLDTLSNGPEGCMVAVA 

TSRSEKEQLQALNDKFAGYIDKVRQLEAHNRSLE 

GEAAALRQQQAGRSAMGELYEREVREMRGAVL 

RLGAARGQLRLEQEHLLEDIAHVRQRLDDEARQ 

REEAEAAARALARFAQEAEAARVDLQKKAQAL 

QEECGYLRRHHQEEVGELLGQIQGSGAAQAQM 

QAETRDALKCDVTSALREIRAQLEGHAVQSTLQ 

SEEWFRVRLDRLSEAAKVNTDAMRSAQEEITEY 

RRQLQARTTELEALKSTKDSLERQRSELEDRHQA 

DIASYQEAIQQLDAELRNTKWEMAAQLREYQDL 

LNVKMALDIEIAAYRKLLEGEECRIGFGPIPFSLF 

EGLPKIPSVSTHIKVKSEEKIKWEKSEKETVrVEE 

QTEETQVTEEVTEEEDKEAKEEEGKEEEGGEEEE 

AEGGEEETKSPPAEEAASPEKEAKSPVKEEAKSP 

AEAKSPEKEEAKSPAEVKSPEKAKSPAKEEAKSP 

PEVAKSPEBCDGKQNFQAEVKSPEKAKSPAKEEAK 

SPAEAKSPEKAKSPVKEEAKSPAEAKSPVKEEAK 

SPAEVKSPEKAKSPTKEEVAKSPEKAKSPEKAKSP 

EKEEAKSPEKAKSPVKAEAKSPEKAKSPVKAEA 

KSPEKAKSPVKEEAKSPEKAKSPVKEEAKSPEKA 

KSPVKEEAKTPEKAKSPVKEEAKSPEKAKSPEKA 

KTLDVKSPEAKTPAKEEARSPADKFPEKAKSPVK 

EEVKSPEKAKSPLKEDAKAPEKEIPKKEEVKSPV

KEEEKPQEVKVKEPPKKAEEEKAPATPKTEEKK 

DSKKEEAPKKEAPKPKVEEKKEPAVEKPKESKV 

EAKKEEAEDKKKVPTPEKEAPAKVEVKEDAKPK 

EKTEEYAKKEPDDAKAfcEPSKPAEKKEAAPEKKD 

TKEEKAKKPEEKPKTEAKAKEDDKTLSKJEPSKP 

KAEKAEKSSSTDQKDSKPPEKATEDKAAKGK 


3456 


A 


258 


1463 


YLSFIPGflASKSAPMNGHCFAENGPSQKSSLPPLL 

IPPSENLGPHEEDQWCGFKKLTVNGVCASTPPL 

TPIKNSPSLFPCAPLCERGSRPLPPLPISEALSLDDT 

DCEVEFLTSSDTDFLLEDSTLSDFKYDVPG\RRSF 

RGCGQINYAYFDTPAVSAADLSYVSDQNGVGVP 

DPNPPPPQTHRRLRRSHSGPAGSFNKPAIRISNCCI 

HRASPNSDEDKPEVPPRVPPPRPVKPDYRRWSA 

EVTSSTYSDEDRPPKVPPREPLSPSNSRTPSPKSLP 

SYLNGVMPPTQSFAPDPKYVSSKALQRQNSEGS 

ASKVPCILPIIENGKKVSSTHYYLLPERPPYLDKY 

EKFFREAKKKNGGAQIQPLPADCGISSATEKPDS 

KTKMDLGGHVKRKHLSYVGTP 


3457 


A 


2 


4869 


FILSSSSSASSEHFHHHYSFGNWWPGSFKGHRMS 
LPFYQRCHQHYDLSYRNKDVRSTVSHYQREKKR 
SAVYTQGSTAYSSRSSAAHRRESEAFRRASASSS 
QQQASQHALSSEVSRKAASAYDYGSSHGLTDSS 
LLLDDYSSKLSPKPKRAKHSLLSGEEKENLPSDY 
MVPIFSGRQKHVSGITDTEEERIKEAAAYIAQRNL 
LASEEGITTPKQSTASKQTTASKQSTASKQSTASK 
1 AoKV2a 1 AoRQo V VbKyATSALQQEETSEKJCS 
RKWIRGKAERLSLRKTLEETETYHAKLNEDHLL 
HAPEFnORSHTVWEKENVKLHCSIAGWPEPRV 
TWYKNQVPINVHANPGKYIIESRYGMHTLEINAC 
DFEDTAQYRASAMNVKGELSAYASWVKRYKG 
EFDETOFHAGASTMPLSFGVTPYGYASRFEIHFD 
DKFDVSFGREGETMSLGCRWITPEIKHFQPEIQ 






^™GWLSPSKWVQTLWSGERATLTFSHLN1U^ 
JEGLYimVRMGEYYEQYSAYVFVRDADAEIEG 

SaapldvkcleankdyiuswkqpAvdggspil . 

TYFIDKCEVGroSWSQCNDTPVKFARFPVTGLIE 
sSS^WmoWsRVSEPVAM-DP^K 
ARLKS/PPLSTLDWnVWTEEEPSEGIVPGPPTDLS 

^SvyLswkppgqrghegimyfvekcea 
gtenworvntelpvksprfalfdlaegksycfe. 
?S5.gvgepseateviwgdkldipkapgki 

TPsSniyrSVVVSWEESKDAKELVGYYffiANVA 

tfsfrdsmvlgwkqpdkiggaeitgyyvnyrev 

OTQVAAMNMAGLGAPSAVSECFKCEEWmW 
GSK^EVRKDSL^a.QWW>PVHSGRTmG 

yfSlkeakakedqwrglneaaiknvylkvrg 
T kegvsyvfrvrainqagvgkpsdlagpvvaet 

iSSvVVWDDDGVISLNFECDKMTPKSEFS 

wskdyvstedsprleveskgnktkmifkdlgm 

DDLGIYSCDVTDTDGIASSYLIDEEELKiaLALSH 
EHKFPTW\^SELAVEILEKGQVRRWMQA£KLS 

gS^vnyifnekqifegpkykmhidrntgiiemf 

SSoDEDEGTYTPQLQDGKATNHSTVVLVGD 

GECNVLLKCKVANKKETOIVWYKDEREISVDE 
;Kffl)FKDQICTl.LITEFSKKDAGIYEVILKraDRGK 
DKSRLKLVDEAFKELMMEVCKKIALSATDLKIQ 
STAEGIOLYSFVTYYVEDLKVNWSHNGSAIRYSD 

VSTADSGKYGLWKNKYGSETSDFTVSVFIPEEE 
/(jPMA A' KSl.KGGKKAK 


3458 


A 


3963


827 


-TSR^SDFnroraOTVMSTATSP^^^^^^ 
TTPGrreTVmSTSSVTSSSNVATATTVLSVGQS 
LShm^TTSLTSTSSESDTGQEAEYSLYDFLDSCRA 
STLLAELDDDEDLPEPDEEDDENEDDNQEDQEY 

AGAGSRPIGEQEEEEYETKGGRRRTWDDDYVLK 

rqfSSrpgrtnvqqttdleipppgtphs 

mLffiVECTPSPRLALTLKVTGLGTTREVELPLTO 

SotS^qkllqlscngnvksdklrriwepty 

SSiKO^SDI^KENGKMGCWSffiHVE^^^ 
T1Tm.O0ffiEn.ALASGALPDW 

J^SSlwtctafgasraivwlqnbreatve 
rS^dpgefrvgrlkhervkvprgesl 
JSSWqmadrksvi^veflgeegtgl^^^ 
Syalvaa^qrtolgawlcddnfpddkmv 

}S nrS m--^9i^«'^-«l^^^APFP0DSDELERI 










TKLFHFLGIFlJ^CIQD>mLVDLPISKPFFKm 
GDIKSNMSKLIYESRGDRDLHCTESQSEASTEEG 
HDSLSVGSFEEDSKSEFILDPPKPKPPAWFNGILT 
WEDFELVNPHRARFLKEIKDLAIKRRQILSNKGL 

CPSSRIYGFTAVDLKPSGEDEMITMDNAEEYVDL 
MFDFCMHTGiQKQMEAFRDGFNKVFPMEKLSSF 
SHEEVQMILCGNQSPSWAAEDIINYTEPKLGYTR 
DSPGFLRFVRVLCGMSSDERKAFLQFTTGCSTLP 
PGGLANLHPRLTVVRKVDATDASYPSVNTCVHY 
LKLPEYSSEEIMRERLLAATMEKGFHLN 


3459 


A 


88 


603 


SCGPRGLASLGLGFSGRCDDQNKGRS\DGPEAQA 

EACSGERTYQELLVNQNPIAQPLASRRLTRKLYK 

CIKKA\^QKQIRRG\^VQKFVNKGEKGIMVLA 

GDTLPffiVYOHLPVMCEDKNLPYVYIPSKTDLGA 

AAGSKRPTCVIMVKPHEEYQEAYDECLEEVQSL 

PLPL 


3460 


A


139 


1997 


QVTNMSDKSELKAELERKKQRLAQIREEKKRKE 

EERKKKETDQKKEAVAPVQEESDLEKKRREAEA 

LLQSMGLTPESPIVPPPMSPSSKSVSTPSEAGSQD 

SGDGAVGSRRGPIKLGMAKITQVDFPPREIVTYT 

KETQTPVMAQPKEDEEEDDDWAPKPPIEPEEEK 

TLKKDEEN\DSKAPPHELTEEEKQQILHSEEFLSFF 

DHSTRIVERALSEQINIFFDYSGPtDF/ENDKEGEIQ 

AGAKLSLNRQFF\DER\WSKASGWVSCLDWSSQ 

YP\ELLVASYNNNEDAPHEPDGVALVWNMKYK 

KTTPEYWHCQSAVMSATFAKFHPNLVVGGTYS 

GQIVLWDNRSNKRTPVQRTPLSAAAHTHPVYCV 

NVVGTQNAHNLISISTDGKICSWSLDMLSHPQDS 

MPT VPTT^n^l^AVAVT^M^FPVOnVKriSlTiVVn^PP 
ivjLCi^ V riJ\>v^oJS>/\ V /w 1 oivior r v kju v inint v v uoe>jd 

GSVYTACRHGSKAGISEMFEGHQGPITGIHCHAA 

VGAVDFSHLYVTSSFDWTVKLWTIKNNKPLYSF 

EDNAGYVYDVMWSPTHPALFACVDGMGRLDL 

WNLNNDTEVPTASISVEGNPALNRVRWTHSGRE 

IAVGDSEGQIVIYDVGEQIAVPKNDEWARFGRTL

AEINANRADAEEEAATRIPA 


3461 


A 


139 


1997 


QVTNMSDKSELKAELERKKQRLAQIREEKKRKE 

EERKKKETDQKKEAVAPVQEESDLEKKRREAEA 

LLQSMGLTPESPIVPPPMSPSSKSVSTPSEAGSQD 

SGDGAVGSRRGPKLGMAKJTQVDFPPRErVTYT 

KETQTPVMAQPKEDEEEDDDWAPKPPIEPEEEK

TLKKDEEN\DSKAPPHELTEEEKQQILHSEEFLSFF 

DHSTRIVERALSEQINIFFDYSGRDF/ENDKEGEIQ 

AGAKLSLNRQFFUDERWSKASGWVSCLDWSSQ 

YP\ELLVASYNNNEDAPHEPDGVALVWNMKYK 

KTTPEYVFHCQSAVMSATFAKFHPNLWGGTYS 

GQIVLWDNRSNKRTPVQRTPLSAAAHTHPVYCV 

NWGTQNAHNLISISTDGKICSWSLDMLSHPQDS 

MELVHKQSKAVAVTSMSFPVGDVNNFVVGSEE 

GSVYTACRHGSKAGISEMFEGHQGPITGmCHAA 

VGAVDFSHLYVTSSFDWTVKLWTTKNNKPLYSF 

EDNAGYVYDVMWSPTHPALFACVDGMGRLDL 

WNLNNDTEVPTASISVEGNPALNRVRWTHSGRE 

IAVGDSEGQIVIYDVGEQIAVPRNDEWARFGRTL

AEINANRADAEEEAATRIPA 



3462 



2643 



3463



198 



146 



PCT/US01/04098




TAPEFSRSTHASAHASVARVLKNREIAQLlfJUil^R 

vmqrmtiWeadmeklikkreelfllqea^ 

SLQAESPEEEKGlXJELAEEffiVLAANTOYMD 
GITDCOATIVQLEETKEELDSTDTSVVISSCSLAE 
ARLLLDNFLKASIDKGLQVAQKEAQmLLEGMJ 

mGsSpSATO-rePLTRWCSYDRGQPmSTO 
VGFTPPSSPPTRPRNDRNVFSRLTSNQSQGSALD 
KSDDSDSSL\SEVLRGIISPVGGAKGARTAPL^ 
SMAEGHTKPILCLDATDELLFTGSKDRSCKMWN 

SttgqXIkghpi^sikycshsglvfsvst 

SYKVWDIRDSAKCIRTLTSSGQVISGDACAA^T 
RAITSAQGEHQINQIALSPSGTMLYAASGNAVRI 

vra^SWFOPVGKLTCHIGPVMCLTVT^^^ 
wSmrVKMFELGECVTCTIGFraNF^PH 
YDGIECLMQGDILFSGSRDNGKK^QQmjQ 
QIPNAHKDWVCALAFIPGRPI^L^ACRAGVK^ 
WNVDNFTPIGEIKGHDSPINAICTOAKHIFTASSG 

PP^^^nyxTv^n^nt TPCLPRRVLAIKGRATTLP 

SGEPRPEPGNM ATCIGEKlEUl^KVGNLLUKUb^A 

GfvRAESIHrGLEVAlKMroKKAM^NJ^Q^^ 
VONEVKIHCQLKHPSILELYNYFEDSNYVYLVLE 

LDHPFMSRNSSTKSKDLGTVEDSlDS(aHAIlSTAI 
TASSSTSISGSLFDKRRLLIGQPLPNKMTVFPKNK 
SSTDFSSSGDGNSFYTQWGNQETSNSGRGRVIQD 

PRCHSAEMLSVSKRSGGGENEERYSPTDNNAMF 

SkSsgsferpdnnqalsnhlcpgktpfp 

FADFTPQTErVQQWFGNLQmAHLRKTIEYDSIS 

pSqghpdlqkdtsknawtdtkvkknsdas 

SS^QQNTMKYMTAIJHSKPEnQQECVFGS 

Sp^eqskSgmeppwgyqnrtlrsi^plvah^ 

LKPIRQKTKKAWSILDSHEVCVELVKEYASQEY 

^viSssdgn-iitiyypngg\rgfpla\drppsp 

SsBXYSFVDNLPEKYWKKYQYASBPVQLVRS 

Skityftryakcilmenspgadfevwfydgv 
SSqviektgksytlksesevnslk^ 

SvT^HANEGHWCLALESnSEEERKTRSAK^^ 

SuSdolpksaqllksvfvknvgwatqnlts 
gav^Sdgsqlvvqagvssisytspngc^ttr 

xvn^.gT.PnYlKOKL(X:i^SILI^SNPTPNFH 


3464 


A 


14 


348 


AVRTVSGTSLGPRSHSRSPGRCHCFSAVTFSSPRL 
AASEAPDPMEEWDVPQMKKEVESLKYQLAFQR 
E1^SKTIPELLKWIEDGPKI>PFLNPDLMKNNPW 
V\EKGKCTIL 


3465 


A 



5537 


405 


VRKLDRERVGAWWRGAWARHPRQEAGEHAKR 

RKGHAETPRGRRKGRAGRSAAAVGELRPARRSL 

ETSRAAAAMAKDSPSPLGASPKKPGCSSPAAAV 

LENQRRELEKLRAELEAERAGWRAERRRFAARE 

RQLREEAERERRQLADRLRSKWEAQRSRELRQL 

QEEMQREREAEIRQLLRWKEAEQRQLQQLLHRE 

RDGWRQARELQRQLAEELVNRGHCSRPGASEV 

SAAQCRCRLQEVLAQLRWQTDGEQAARIRYLQ 

AALEVERQLFLKYILAHFRGHPALSGSPDPQAVH 

SLEEPLPQTSSGSCHAPKPACQLGSLDSLSAEVG 

VRSRSLGLVSSACSSSPDGLLSTHASSLDCFAPAC 

SRSLDSTRSLPKASKSEERPSSPDTSTPGSRRLSPP 

PSPLPPPPPPSAHRKLSNPRGGEGSESQPCEVLTPS 

PPGLGHHELIKLNWLLAKALWVLARRCYTLQEE 

NKQLRRAGCPYQADEKVKRLKVKRAELTGLAR 

RLADRARELQETNLRAVSAPIPGESCAGLELCQV 

FARQRARDLSEQASAPLAKDKQIEELRQECHLLQ 

ARVASGPCSDLHTGRGGPCTQWLNVRDLDRLQ 

RESQREVLRLQRQLMLQQGNGGAWPEAGGQSA 

TCEEVRRQMLALERELDQRRRECQELGAQAAPA 

RRRGEEAETQLQAALLKNAWLAEENGRLQAKT 

DWVRKVEAENSEVRGHLGRACQERDASGLIAEQ 

LlfQQlAARGQ^ 

QALQeRPGHPPEQPWETSQMPESQVKGSRRPKF 

HARAEDYAVSQPNRDIQEKREASLEESPVALGES 

ASVPQVSETVPASQPLSKKTSSQSNSSSEGSMWA 

TVPSSPTLDRDTASEVDDLEPDSVSLALEMGGSA 

APAAPKLKIFMAQYNYNPFEGPNDHPEGELPLTA 

GDYIYIFGDMDEDGFYEGELEDGRRGLVPSNFVE 

QIPDSYIPGCLPAiCSPDLGPSQLPAGQDEALEEDS 

LLSGKAQGWDRGLCQMVRVGSKTEVATEILDT 

KTEACQLGLLQSMGKQGLSRPLLGTKGVLRMAP 

MQLHLQNVTATSANITWVYSSHRHPHVVYLDD 

REHALTPAGVSCYTFQGLCPGTHYRARVEVRLP 

RDLLQVYWGTMSSTVTFDTLLAGPPYPPLDVLV 

ERHASPGVLWSWLPVTIDSAGSSNGVQVTGYA 

VYADGLKVCEVADATAGSTLLEFSQLQVPLTWQ 

KVSVRTMSLCGESLDSVPAQIPEDFFMCHRWPET 

PPFSYTCGDPSTYRVTFPVCPQKLSLAPPSAKASP 

HNPGSCGEPQAKFLEAFFEEPPRRQSPVSNLGSE 

GECPSSGAGSQAQELAEAWEGCRKDLLFQKSPQ 

NHRPPSVSDQTGEKENCYQHMGTSKSPAPGFIHL 

RTECGPRKEPCQEKAALERVLRQKQDAQGFTPP 

QLGASQQYASDFHNVLKEEQEALCLDLWGTERR 

PSAKVDCMPRGGPQQLGTGANTPARVFVALSDY 

NPLVMSANLKAAEEELVFQKRQLLRVWGSQDT 

HDFYLSECNRQVGNIPGRLVAEMEVGTEQTDRR 

WRSPAQGHLPSVAHLEDFQGLTIPQGSSLVLQGN 

SKRLPLWTPKIMIAALDYDPGDGQMGGQGKGRL 

ALRAGDVVMVY\GPMDDQGFYYGELGGHRG\L 



1111 



WANLRKMSSQGH 



JSSwGEPKEKGQLYNLPAElPCPTLTPPTPP 
^GPWGNIFFLE'ISDRTOPNFLFMCSVESAAR'ra 

SS^lWlfrdtpuujwyaavqgrwpyll 
pvSdasrialmwkfggiyldtofivlki^rnlt 

diSngwiwghqgpqlltrvfkkwcsirslaesr 

SvSpEAFYPiPWQDWKKYFEDINPEELP 

rlSaiyavhvwnkksqgtrfeatsrallaqlh 

SvfST^IffiNVLVKGPAGHLPNLLLMGHW 



2175 



MkviLKQSKQC]^LLlC KVAQVCPVC<i(X^^^ 



3468 



YFWW.SGLESR^^^ 

psvlrrtynpddyfrkfephlysldsnsddvdsl 

S&KYQLGMLHFSTQYDLLHNHLTVRVffiA 

SSppiSsrqdmahsnpyvkicllpdqkns 

{£LEAKQQW.VEGEMLFIPARAANLPVNNKPyM 
SsLvSpTmGLSFYDSRTTSLLHPARQIQLPNS^^ 
qSSiLSVALTLFSRSPLEQNIIQPLy^^^ 
CGSWNMPPGNSQPRGDFLYHSICTWVQDNYAQ 

pSmfnittohlsklfaq^^ 

^jSvKARMILQKYHLSIHEVA^GF^^^ 

rRVFRROFGMDYVDILQIHRWDYNTPIEETLEAL 

Sw^SSlGASSMHASQFAQALELQKQH 

AELETPYKPHPWGFK 

ALPLPLPTLYPGMSRRKQRKJ^Cji^L ISPCEUreA^^ 



ngdaseWqvcakccaqftdpteflahqna^ 

S•SCTVMVIIGGQENPNNSSASSEP1U>EGHN^^Q 

?SSsNPPDSGSSVPTDPTWGPBRRGEESSGH 

ELvSrOTAAGGGGGULASPKLGATPLPPESW 

APPPPPPPPPPPGVGSGHLNIPLILEELRVLQQRQI 

HS^SEQICRQVLLLGSLGQTVGAPASPSaP 

^^GVEFXPLVASTTALSATESLTLLSTSAGT 

aSglpafnkfvlmkavepknkadentppgse 
Sakgvaesstattjmqlsklvtslpswall^ 
Sctgsfhlplcaralgxaspsetsklqqlvekto 

^GTvAVmASGAPTTSAPAPSSSASSGPNQC^^^ 
raS^isCPRALRLHYGQHGGERPFKCKVCGRAF 










NAVTLQQHVRMHLGGQIPNGGTALPEGGGAAQ 

ENGSEQSTVSGAGSFPQQQSQQPSPEEELSEEEEE 

EDEEEEEDVTDEDSLAGRGSESGGEKAISVRGDS 

EEASGAEEEVGTVAAAATAGKEMDSNEKTTOOS 

SLPPPPPPDSLDQPQPMEQGSSGVLGGKEEGGKP 

ERSSSPASALTPEGEATSVTLVEELSLQEAMRKEP 

GESSSRKACEVCGQAFPSQAAL\EEH\QKTHPKEG 

PLF\TCVFCRQGFLERATLKKHMLLAHHQVQPFA 

PHGPQNIAALSLVPGCSPSITSTGLSPFPRKDDPTI 

P 


3469 


A 


3 


5664 


NLRPLSFALFLGDPNMANLEESFPRGGTRKIHKP 

EKAFQQSVEQDNLFDISTEEGSTKRKKSQKGPAK 

TKKLKIEKRESSKSAREKFEILSVESLCEGMRILG 

CVKEVl^IJELWSLPNGLQGFVQVTEICDAYTKK 

LNEQVTQEQPLKDLLHLPELFSPGMLVRCWSSL 

GITDRGKKSVKLSLNPKNVNRVLSAEALKPGML 

LTGTVSSLEDHGYLVDIGVDGTRAFLPLLKAQEY 

IRQKNKGAKLKVGQYLNCIVEKVKGNGGWSLS 

VGHSEVSTAIATEQQSWNLNNLLPGLVVKAQVQ 

KVTPFGLTLOTLTFFTGWDFMHLDPKKAGTYFS 

NQAVRACILCVHPRTRWHLSLRPIFLQPGRPLTR 

LSCQNLGAVLDDVPVQGFFKKAGATFRLKDGVL 

AYARLSHLSDSBOmWEAFKPGNTHKCRn 

QMDELALLSLRTSlIEAQYLRYHDIEPGAVyKGT 

VLTIKSYGMLVKVGEQMRGLVPPMHLADILMK 

NPEKKYHIGDEVKCRVLLCDPEAKKLMMTLKKT 

LIESKLPVFTCYADAKPGLQTHGFIiRVKD YGCIV 

KFraNVQGLVPKHELSTEYIPDPERVFYTGQVV 

KWVLNCEPSKERMLLSFKLSSDPEPKKEPAGHS 

QKKGKAINIGQLVDVKVLEKTKDGLEVAVLPHN 

IRAFLPTSHLSDHVANGPLLHHWLQAGDILHRVL 

CLSQSEGRVLLCRKPALVSTVEGGQDPKOTSEIH 

PGMLLIGFVKSIKDYGVFIQLPSGLSGLAPKAIMS 

DKFVTSTSDHFVEGQTVAAKVTNVDEEKQRMLL 

SLRLSDCGLGDLAITSLLLLNQCLEELQGVRSLM 

S>mDSVLIQTLAEMTPGMFLDLVVQEVLEDGSV 

VFSGGPVPDLVLKASRYHRAGQEVESGQKKKVV 

ILNVDLLKLEVHVSLHQ\DLV\NRKARKLRKGSE 

HQAIVQHLEKSFAIASLVETGHLAAFSLTSHLND 

TFRFDSEKLQVGQGVSLTLKTTEPGVTGLLLAVE 

GPAAKRTMRPTQKDSETVDEDEEVDPALTVGTI 

KKHTLSIGDMVTGTVKSIKPTHVVVTLEDGIIGCI 

HASfflLDDVPEGTSPTTKLKVGKTVTARVIGGRD 

MKTFKYLPISHPRFVRTEPELSVRPSELEDGHTAL 

NTHSVSPMEBCIKQYQAGQTVTCFLKKYNWKK 

WLEVEIAPDIRGRIPLLLTSLSFKVLKHPDKKFRV 

GQALRATWGPDSSKTFLCLSLTGPHKLEEGEVA 

MGRWKVTPNEGLTVSFPFGKIGTVSIFHMSDSY 

SETPLEDFVPQKWRCYILSTADNVLTLSLRSSRT 

NPETKSKVEDPEINSIQDKEGQLLRGYVGSIQPH 

GVFFRLGPSVVGLARYSHVSQHSPSKKALYNKH 

LPEGKLLTARVLRLNHQKNLVELSFLPGDTGKPD 

VLSASLEGQLTXQEERKTEAEERDQKGEKKNQK 

RNEKKNQKGQEEVEMPSKEKQQPQKPQAQKRG 

GRECRESGSEQERVSKKPKKAGLSEEDDSLVDV 



3470 A 






2334 



3471 



3472 



537 






lhfREGKEEAHET^WLPKEKQ^KPAEAFIU.^2l.ssvi 



FAWNVGLDSLTPALPPLAESSDSEEDEKPHQAH 
KKSKKERELEKQKAEKELSRTEEALMDPGRQPE 
SADDFDRLVLSSPNSSILWLQYMAFHLQATEIEK 
ARAVAERALKTISFREEQEKLNVWVALLNLEmi 
YGSQESLTKVFERAVQYNEPLKVFLHLADIYAKS 
EKFQEAGELYNRMLKRTOQEKAVWIKYGAFLLR 
SOAAASHRVLQRALECLPSKEHVDVIAKFAQL 
SoLGDAERAKAIFENTLSTYPKRTOVWSVYID 
SnKHGSQKDVRDIFERVIHLSLAPKRMKFFFF^ 

ylSqhgtekdvqavkakaleyveakssvl 

ED 



1226 



148 



2272 



3473 



TAAAPVAPGTMDDATVLRKKUVlVUUvlLUkOSir 
kvKSAYSERLKFNVAVKIIAEKKTPTDFVERFL 
PI^MDILATVNHGSIIKTyEIFETSDGRIYIIMm.G 
VQG^WcQGALHEDVARKMFRQLSSAVKY 
6roSlVHIU)LKCENLLLDKDFNlKLSDFGFS^ 
cSijSNGRia^KTFCGSAAYAAPEVLQSff^^^ 
VYDIWSLGVILYIMVCGSMPYDDSDIW^MQK 
EHRVDFPRSKNLTCECKDLIYRMLQ\PDVS\KBLH 
BDEILSHSWLQPPKPKNATSSASFKREGEGKYRAE 

?kStktglrpdhi^dhklgaktqhrllvwe^ 

RNRMEDRLAETSRAKDHmSGAEVGKAST 



TERGAPQHFlLPmLi PSSVHTGQPKTiVSVlLFL 



PSCEEPQANKATLVCIJVlbm/FYPGE-MVT^^^ 
GTLITQSVEKITPSKQSNNKYVASSYLSLTPEQW 

P<;ppgvc;rnvMnEGStVEKSVAPAECS 



DKPTRHKTYLSSSWAlOl^AAEGPVUlXjliJ-wvi 

^nhwflrlreglknqspteaekpassslpss 

VVLLTSDNVmiYSLREPQTPTNVnLSEAEEESLy 

lnkgraytaslgetavafdfgplaavpkilfgq 

NGKDEVVAYPLYILYENGETFLTYISLLHSPGN/I 

wkavgsiahas\aaednygydacavlclpcvpn 

n^miESGMLYHCVVLEGEEEDDHTSEKSWDSR 
mLffSLYVFECVELELALKLASGEDDPFDSDFSC 
i^SKCPSRYHC™EAGVHSV.^TWmL 
WGSDEEDKDSLQELSTEQK^VBHnODT^LP 

Slsycreebkslremaerladkyeeakek 

OEDIMNRMKKLLHSFHSELPVLSDSERDMKMEL 

SdqlrmSnaikq^ 

SLPKPTIILSAYQRKCIQSILKEEGEHIREMVKQIN 
DIKNHVNF 



2272 



DKFrRHK-nLSSSWAKMAAAKUPVGDGLL\VQT 
tSJHVWLBLREGLKNQSPTEAEKPASSSLPSS 

SSlltrnvvfglggelflwdgedssflvvrlr 
Sggeepalsqyqkllcinpplfeiyqvllspt 

^gg^Wr,T^.F.TPKRWGKNSEFEGGKST 
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SEQID 
NO: 


Method 


Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 










VNCSTTPVAERPFTSSTSLTLKHAAWYPSEILDPH 
VVLLTSDhrmiYSLREPQTPTmaiLSEAEEESLV 
LNKGRAYTASLGETAVAFDFGPLAAVPKTLFGQ 
NGKDEWAYPLYILYENGETFLTYISLLHSPGN/I 
WKAVGSIAHAS\AAEDNYGYDACAVLCLPCVPN 
BLVIATESGMLYHCWLEGEEEDDHTSEKSWDSR 
. IDLIPSL YVFEC VELELALKLASGEDDPFDSDFSC 
PVKLHRDPKCPSRYHCTPffiAGVHSVGLTWIHKL 
HKFLGSDEEDKDSLQELSTEQKCFVEHILCTKPLP 
CRQPAPIRGFWIVPDILGPTMICITSTYECLIWPLL 
STVHPASPPLLCTREDVEVAESPLRVLAETPDSFE 
KHIRSILORS V Al^AFLKASEKDIAPPPEECLOLLS 
RATQVFREQYILKQDLAKEEIQRRVKLLCDQKK 
KQLEDLSYCREERKSLREMAERLADKYEEAKEK 
QEDIMNRMKKLLHSFHSELPVLSDSERDMKKEL 
QLPDQLRHLGNAIKQVTMKKDYQQQKMEKVL 
SLPKPTIILSAYQRKCIQSILKEEGEHIREMVKQIN 
DIRNHVNF 


3474 


A 


4344 


2550 


DRRREPERHVRVKQRTSVLNMLRRLDKIRFRGH 
KRDDFLDLAESPNASDTECSDEIPLKVPRTSPRDS 
EELRDPAGPGTLIMATGVQDFNRTEFDRLNEIKG 
HLEIALLEKHFLQEELRKLREETNAEMLRQELDR 
ERQRRMELEQKVQEVLKARTEEQMAQQPPKGQ 
AQASNGAERRSQGLSSRLQKWFYERFGEYVEDF 
RFQPEENTVETEEPLSARRLTENMRRLKRGAKPV 
.TOT3^KNLSALSDWSyYTSAlAFTVy^ . 










SEP^/H>PK£DL^VS 

MADILEKIKNLFMWVQPEITQKLYVALWAAFLA 

SCFFPYRLVGLAVGLYAGIKFFLIDFIFKRCPRLR 

AKYDTPYIIWRSLPTDPQLKERSSAAVSRRLQTTS 

SRSYVPSAPAGLGKEEDAGRFHSTKXGNFHEIFN 

LTENERPLAVCENGWRCCLINRDRKMPTDYIRN 

GVLYVTVENYLCFESSKSGSSKRl^VIKLVDITDI 

QKYKVLSVLPGSGMGIAVSTPSTQKPLVFGAMV 

HRDEAFETILSQYIKITSAAASGGDS 


3475 


A 


2 


1126 


TAARRRQKGAAAAAETHGQAKAKSGWLKPYYF 

ffiLMESRKDITNQEELWKMKPRKNLEEDDYLHK 

DTGETSMLKRPVLLHLHQTAHADEFDCPSELQH 

TQELFPQWHLPIKIAAIIASLTFLYTLLREVIHPLA 

TSHQQYFVXIPILVINKVLPMVSITLLALVYLPGV 

L\AIVQLHNGTKYKKFPHWLDKWMLTRKQFGL 

LSFFFAVLHAIYSLSYPMRRSYRYKLLNWAYQQ 

VQQNKEDALMEHDVWRMEIYVSLGIVGLAILAL 

LAVTSIPSVSDSLTWREFHYIQSKLGIVSLLLGTIH 

ALffAWl^WroiKQFVWYTPPTFMIA\ai.PIVVLI 

FKSILFLPCLRXKILKIRHGWEDVTKINKTEI^ 


3476 


A 


143 


3191 


AKAPPTGESSEPEAKVLHTKRLYRAVVEAVHRL 

DLILCm:TAYQEVFOEOTSLRNKLRELCVKLMF 

LHPVDYGRKAEELLWRKVYYEVIQLIKTNKKHI 

HSRSTLECAYRTHLVAGIGFYQHLLLYIQSHYQL 

ELQCCIDWTHVTDPLIGCKKPVSASGKEMDWAQ 

MACHRCLVYLGDLSRYQNELAGVDTELLAERFY 

YQALSVAPQIGMPFNQLGTLAGSKYYNVEAMY 

CYLRCIQSEVSFEGAYGNLKRLYDKAAKMYHQL 
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SEvi jlu " 
NO: 


Method 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 










CKCETOKLSPGKKRCkblKRLLVNtMYiAjai^V 

PKSSSVDSELTSLCQSVLEDFNLCLFYLPSSPNLS 

LASEDEEEYESGYAFLPDLLIFQMVnCLMCVHSL 

ERAGSKQYSAAIAFTLALFSHLVNHVNIRLQAEL 

^raqPWAFQSDGTDEPESKEPVEKEEEPDPEPP 

pJwQVGEGRKSRKFSRLSCLRRRRHPPKVGDDS 

DLSEGFESDSSHDSARASEGSDSGSDKSLEGGGT 

AFDAETOSEMNSQESRSDLEDMEEEEGTRSPTTLE 

PPRGRSEAPDSLNGPLGPSEASIASNLQAMSTQM 

FQTKRCFiaAPTFS>n.LLQPTTOPHTSASI^Cy 

NGDVDKPSEPASEEGSESEGSESSGRSCRNERSIQ 

EKLQVLMAEGLlPAVKVFLDWLRTWDLnVCA 

QSSQSLWNRLSVLimU'AAGELQESGLALCPEV 

ODLLEGCELPDLPSSLLLPEDMALRNLPPLRAAH 

^RFNFDTDRPLLSTLEESVVRICCIRSFGHFIARLQ 

GSILQFNPEVGIFVSIAQSEQESLLQQAQAQFRMA 

qeeaWjrlmrdmaqlrlqlevsqlegslqqpk 

AOSAMSPYLVPDTQALCHHLPVIRQLATSGRFIVI 

iprtvidgldllkkehpgardgiryleaefkkgn 

RYIRCQKEVGKSFERHKLKRQDADAWTLYKIU) 
SCKQL1\LAQGAGEEDPSGMVTIITGIJPLDNPSVL 
g^S!;\^TXAAAHASVDIKNVLDFYKQWKEIG 


3477 


A 


1 


3902 


"MIEPRERRGYS VPPRPbV UTQATEWRVEESNFN 
KIFLKKDAELGRSNHLPTWDKPEDASWLPQSCL 
GGDAVATTCEIHEEKAWKTRALEVGQPAQRDIR 

Selwgkehgadqaiqetledlsslertlvvses 
elgrMerrrqagaafqvlqlpqalpiqvdseegl 

LSTGRRLDREQLCRQWDPCLVSFDVLATGDLALI 

hveiqvldindhqprfpkgeqeleisesaslrtrip 

iSSx^PDTGPNTLHTYTLSPSEHFALDVrVGPD 

etkhaeliwkeldreihsffdlvltaydngnpp 
ksgtslvkvnvldsndnspafaesslaleiqeda 

APGTLLKLTATOPDQGPNGEVEFFLSKHMPPaV 

LDTFSroAKTGQVILRRPLDYEKNPAYTEWyQ^ 

DLGPNPIPAHCKVLIKVLDVNDNIPSIHVTWASQP 

SLVSEALPKDSFIALVMADDLDSGNNGLVHCWL 

SQELGHFRLKRTOGNTYMLLTNATLDl^QWK 

miLLAQDQGLQPLSAKKQLSIQISDINDNAPVF 

EKSRYEVSTOENNLPSLHLITIKAHDADLGINGK 

VSYRIQDSPVAHLVAlDSNTQEVTAQRSLNYraM 

AGFEFOVIAEDSGQPMLASSVSVWVSLLDANDN 

APEWOPVLSDGKASLSVLVNASTGHLLVPIETP 

NGLGPAGTDTPPLATHSSRPFLLTTIVARDADSG 

ANGEPLYSIRSG>ffiAHLFILNPHTGQi™VTOA 

SSLIGSEWEI^mnEIXJGSPPLQTRALLRVN^ 

VDHLRDSARKPGALSMSMLTVICLAVLLGIFGLI 

LALFMSICRTEKKDNRAYNCREAESTYRQQPKR 

PQKfflQKADfflLVPVLRGQAGEPCBVGQSHKDV 

^^vwATJr^/rPAn^VDPCLOAPFHLTPTLYRTLRNQG 

NOGAPAESREVLiQDTVNLLFNHPRQRNASREOT. 

Slp^patgqprsrplkvagsptgrlagdqgse 

E/^QRPPASSATLRRQRHLNGKVSPEKESGPRQI 

^lvrlsvaafaernpveeltvdsppvqqisqll 
Shqgqfqpkpnhrgnkylakpggsrsaipdto 
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SEQID 
NO: 


Method 


Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 










GPSARAGGQTDPEQEEGPLDPEEDLSVKQLLEEE 

LSSLLDPSTGLALDRLSAPDPAWMARLSLPLTTN 

YRDNVISPDAAATEEPRTFQTFGKAEAPELSPTG 

niLASTFVSEMSSLLEMLLEQRSSMPVEAASEAL 

RKLSVCGRTLSLDLATSAASGMKVQGDPGGKTG 

TEGKSRGSSSSSRCL 


3478 


A 


13 


1620 


TLPPPGNSGCHRLCFPEFEFLQVTKMEFSGRKWR 
KLRLAGDQRNASYPHCLQFYLQPPSENISLIEFEN 
LAIDRVKLLKSVENLGVSYVKGTEQYQSKLESEL 
RKLKFSYRENLEDEYEPRRRDHISHFILRLAYCQS 
EELRRWFIQQEMDLLRFRFSILPKDKIQDFLKDSQ 
LQFEAISDEEKTLREQEIVASSPSLSGLKLGFESIY 
KIPFADALDLFRGRKVYLEDGFAYVPLKDIVAIIL 
NEFRAKLSKALALTARSLPAVQSDERLQPLLNHL 

SHSYTGQDYSTQGNVGKISLDQIDLLSTKSFPPC 

TiyrDOT zjv AT ocxTTJurT ^Tjnm>\Ar\'\//^i T7T v/^jnj 
MKV^LrllsJVJjKJiiS nJlLKJrluOKMy i OLr LKUlLyL 1 

LEQALQFWKQEFIKGKMDPDKFDKGYSYNIRHS 

FGKEGKRTDYTPFSCLKIILSNPPSQGDYHGCPFR 

HSDPELLKQKLQSYKISPGGISQILDLVKGTHYQ 

V\ACQKYFEMIHTVDDCGFSVLSHPNQYFCESQRI 

LNGGKDKKEPIQPETPQPKPSVQKTKDASSALA 

oLiJN ool^nMlJJVLDOLrCLI i r ot!rJLIo 


3479 


A 


698 


138 


RPELELWRLRSRSWRPLGVPRRCHRRNWKEPVR 

AQPLSVTVWAPRCQRP/QPPAPEPSSPNAAVPEAI 

PTPRAAASAALELPLGPAPVSVAPQAEAEARSTP 

GPAGSRLGPETFRQRFRQFRYQDAAGPREAFRQL 

REL/SPRQWLRPDI\RTKEQ\IVEM.VQEQLLAILP ' 

EAARARRIRRRTDVRITG 


3480 


A 


117 


2226 


RRGSRSRGPFAEPAAPGGLCSSSEEKTEEGGMAV 

GLCKAMSQGLVTFRDVALDFSQEEWEWLKPSQ 

KDLYRDVMLENYRNLVWLGLSISKPNMISLLEQ 

GKEPWMVERKMSQGHCADWESWWEIEELSPK 

WFIDEDEISQEMVMERLASHGLECSSFREAWKY 

KGEFELHQGNAERHFMQVTAVKEISTGKRDNEF 

SN/IWEKHTPEISIFNTTES\PTIQQVHKFDIYDKLF 

PQNSVIIEYKRLHAEKESLIGNECEEFNQSTYLSK 

DIGIPPGEKPYESHDFSKLLSFHSLFTQHQTTHFG 

KLPHGYDECGDAFSCYSFFTQPQRIHSGEKPYAC 

NDCGKAFSHDFFLSEHQRTfflGEKPYECKECNKA 

FRQSAHLAQHQRIHTGEKPFACNECGKAFSRYAF 

LVEHQRIHTGEKPYECKECNKAFRQSAHLNQHQ 

RIHTGEKPYECNQCGKAFSRRIALTLHQRIHTGE 

NJrr Jvi^OEr^vjjv 1 r vj I Korli^iN v^rlv^KXrl i viJciSJr x JDUl 

KCGKFFRTDSQLNRHHRIHTGERPFECSKCGKAF 

SDALVLIHHKRSHAGEKPYECNKCGKAFSCGSY 

LNQHQRIHTGEKPYECSECGKAFHQILSLRLHQRI 

HAGEKPYKCNESQRVRRSELAVSRGLTTKPADT 

GPDSTLNAAKVAEPARAGTEAALRPALSVAESA 


3481 


A 


2 


1522 


ASRHGMTPGALLMLLGALGPPLAPGVRGSEAEG 
RLREKLFSGYDSSVRPAREVGDRVRVSVGLILAQ 
LISLNEKDEEMSTKVYLDLEWTDYRLSWDPAEH 
DGIDSLRITAESVWLPDWLLNNNDGNFDVALDl 
SVWSSDGSVRWQPPGIYRSSCSIQVTYFPFDWQ 
NCTMVFSSYSYDSSEVSLQTGLGPDGQGHQEIHI 
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SEQ ID M 
NO: 


Method 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


ffiGTFIENGQWENIHKPSRLl(jPPUl>PRGGRfcWj 
lOEVIFYLIIRRKPLFYLVNVlAPCILlTLLAIFVFY 
.WDAGEKMGLSIFALLTLTVFLLLU^KV^^^^^^ 
SVPniKYLMnMVLVTTSVE-SWVLNLHHRSPH 

SHSGSGWGRGTOEYFmKPPSDFLFPKPNRP 

SSdlf^gpnravallpelrevvssisyi 
SqeqSalkedwqfvamvvdrlflwtt 

ITFTSVfiTIAVIFLDATYHLPPPDPFP 


3482 




1273 


172 


TOWDSGGADAHWYALADWTAV WLPRSBFTm 

L0TGEG5mim.PAGMPPDSPRELW^^ 

SDPALPWILGHGNQPPAWPEPQGPMGPAGVAA 

vSmaGGRKKGGA\GRTSGRGPWEMVL>miGF 
PSSVAALRFEWAWQHPHASim^HVGmRG^^ 

afWrvlahmlrappwarlpltlrwvrpdlr 

OT)LCLPPPPHVLLAFGPPPAQVPRPQRRRAGPFD 
SSrepSoDPGACCSLCAQ-nQDEEGPLCCPHP 

SJ^Jr^olaeeflqeepgqllplegqcpcce 
ksllwgdliwlcqmdtekevedseleeahwtd 

T LET 


3483 


A 


230 


3686 


wrpwpcidtswnlqvaartlrvssaqcglw 1 
marvespvpaarasltgscvlgqamplrggagp 

SPASHGPraGPSDPRTCLPGRGAGGMKPHGRGA 

lgccglcsfytchgaagdeimhqdivplcaadiq 

KEFQNVMTYL-reiPSLQDAGIGFILVIDRRRDKW 
SXsVLRIAASFPANLQLVLVLRPTGFFQRTLS 

nodqldnqatvqrllaqlnetoaafdefwakh 

SSScLQUU#EQGFREVKAILDy^^^^^ 

toignslahWhllrdlanfqeksgvfverara 

lsltassfignkhyavdsirpkcqelrhlcdqfsa 

eSrrrgllskslelhrrletsmkwcdegiylla 

sopvdkcqsqdgaeaalqeiekfletgaenkiqe 

lnaiykeyesilnqdlmehvrkvfqkqasmeev 

FXoA^Lm.AARQrRPVQPVAPRPEALAKSP 

SSrrSnssseggalrrgpyrraksemses 

rogrgsageeeeslailrrhvmselldterayve 

ellcvlegyaaemdnplmahllstglhnkkdv 

lfgnmeeiyhfhnriflrelenytdcpelvgrcf 

lSSqiyekycqnkprseslwrqcsdcpffq 

e^^dS^ldsyllkpvqritkyqlllkem 

lkysbncegaedlqealssilgilkavndsmhli 

AITGYDGNLGDLGKLLMQGSFSWTOI^GHr 

kvkelarfkpmqrhlflhekavlfckkreenge 

rvw AP«?YSYKOSLNMAAVGITEhrVKGDAKKFE 
f^SE^^Q^^DCAAWVNEIRKVLTSQ 
LO^EASQHRALEQSQSLPLPAPTSTSPSRGNSR 

I^^SeiStopSlegyvssaplikppekgkgw 

SKTSHSLEAPEDDGGWSSAEEQINSSDAEEDGGL 
GSvTOKYrVVAnmKGGPDALRVRSGDVV 
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SEQID 
NO: 



Method 



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 

sequence 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 



Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 



ELVQEGDEGLW 



3484 



208 



6103 



VTMAQQAADKYLYVDKNFINNPLAQADWAAK 

KLVWVPSDKSGFEPASLKEEVGEEAIVELVENGK 

KVKVNKDDIQKMNPPKFSKVEDMAELTCLNEAS 

VLHNLBCERYYSGLIYTYSGLFCVVINPYKNLPIYS 

EEIVEMYKGKKRHEMPPHIYAITDTAYRSMMQD 

REDQSILCTGESGAGKTENTKKVIQYLAYVASSH 

KSKKDQGELERQLLQANPILEAFGNAKTVKNDN 

SSRFGKFIRINFDVNGYIVGANIETYLLEKSRAIRQ 

AKEERTFHIFYYLLSGAGEHLKTDLLLEPYNKYR 

FLSNGHVTIPGQQDKDMFQETMEAMRIMGIPEEE 

QMGLLRVISGVLQLGNIVFKKERNTDQASMPDN 

TAAQKVSHLLGINVTDFTRGILTPRIKVGRDYVQ 

KAQTKEQADFAIEALAKATYERMFRWLVLRINK 

ALDKTKRQGASFIGILDIAGFEIFDLNSFEQLCINY 

TNEKLQQLFNHTMFILEQEEYQREGIEWNFroFG 

LDLQPCIDLIEKPAGPPGILALLDEECWFPKATDK 

SFVEKVMQEQGTHPKFQKPKQLKDKADFCIIHY 

AGKVDYKADEWLMK>fi^PLNDNIATLLHQSSD 

KFVSELWKDVDRUGLDQVAGMSETALPGAFKT 

RKGMFRTVGQLYKJEQLAKIJVIATLRNTNPNFVR 

CnPNHEKKAGKLDPHLVLDQLRCNGVLEGIRICR 

QGFPNRWFQEFRQRYEILTPNSIPKGFMDGKQA 

CVLMIKALELDSNLYRIGQSKVFFRAGVLAHLEE 

ERDLKITDVnGFQACCRGYLARKAFAKRQQQLT 

AMKVLQRNCAAYLKLRNWQWWRLFTKVKPLL 

;(3ySi«5EEEMM^ 

METLQSQLMAEKLQLQEQLQAETELCAEAEELR 

ARLTAK\KQ\ELEEICHDLEARVEEEEERCQHLQA 

EKKKMQQNIQELEEQLEEEESARQKLQLEKVTT 

EAKLKKLEEEQIILEDQNCKLAKEKKLLEDRIAEF 

TThn^TEEEEBLSKSLAKLKKKHEAR^^ 

EEKQRQELEKTRRKLEGDSTDLSDQIAELQAQMA 

ELKMQLABCKEEELQAALARVEEEAAQKNMALK 

KIRELESQISELQEDLKCERVASRNKAEKQKRDLG 

EELEALKTELEDTLDSTAAQQELRSKREQEVNEL 

KKTLEEEAKTHEAQIQEMRQKHSQAVEELAEQL 

EQTKRVKANLEKAKQTLENERGELANEVKVLLQ 

GKGDSEHKRKKVEAQLQELQVKFNEGERVRTEL 

ADKVTKLQVELDNVTGLLSQSDSKSSKLTKDFS 

ALESQLQDTQELLQEENRQKLSLSTKLKQVEDE 

KNS\FREQLEEEEEEAKHNLEKQIATLHAQVADM 

KKKMEDSVGCLETAEEVKRKLQKDLEGLSQRHE 

EKVAAYDKLEKTKTRLQQELDDLLVDLDHQRQ 

SACNLEKKQKKFDQLLAEEKTISAKYAEERDRA 

EAEAREKETKALSLARALEEAMEQKAELERLNK 

QFRTEMEDLMSSKDDVGKSVHELEKSKRAIEQQ 

VEEMKTQLEELEDELQATEDAKLRLEVNLQAM 

KAQFERDLQGRDEQSEEKKKQLVRQVREMEAE 

LEDERKQRSMAVAARKKLEMDLKDLEAHDDSA 

NKl^EAIKQLRKLQAQMKDCMRELDDTRASR 

EEILAQAKENEKKLKSMEAEMIQLQEELAAAER 

AKRQAQQERDELADEIANSSGKGALALEEKRRL 

EARIAQLEEELEEEQGNTELINDRLKKANLQIDQI 

NTDLNLERSHAQKNENARQQLERQNKELKVKL 
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ct?f\ m 1 IV 
NO: 


Method 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 










:>EMEGTVKSKYKASITALBAKIAQLEBQLt)NblJ«. 

EROAACKQVRRTEKKLKDVLLQVDDERRNAEQ 

nCDQADKASTRLKQLKRQLEEAEEEAQRANASR 

m.QREI^ATETADAMNREVSSLKNKLRRGDL 

PFVWRRMARKGAGDGSDEEVDGICADGAEAKP 

AP 


3485 


A 


1 


1782 


CSTGVSKAPLTYLMSYGFELGWRKGNKAVACK. 

EDRGGESVGMGQESILSQVHWWEAEPVEKTPGR 

DSEATIMSLRVHTLFrLLGAVVRPGCRELLCLLM 

ITVTVGPGASGVCPTACICATDIVSCTOKNLSKVP 

GNLFRLKRLDLSWRIGLLDSEWIPVSFAKLNTL 

n.RHNNI-reiSTGSFSTTPNLKCLDLSSNKLK-nVK 

NAVFQELKVLEVLLLYNNfflSYLDPSAFGGLSQL 

OKLYLSGNFLTQFPMDLYVGRFKLAELMFLDVS 

YNRIPSMPMHHINLVPGKQLRGIYLHGNPFVCDN 

CSLVSLLVFWYRRHFSSVMDFKNDYTCRLWSDS 

RHSROVLLLQDSFMNCSDSIINGSFRALGFIHEAQ 

VGEKJLMVHCDSKTGNANTDFIWVGPDNRLLEPD 

KEMENFYVFHNGSLVIESPRFEDAGVYSCIAMNK 

ORLLNETVDVTINVSNFTVSRSHAHEAFNTAFTT 

LAACVASIVLVLLYLYLTPCPCKCK1KRQKNML 

HOSNAHSSILSPGPASDASADERKAGAGKRVyn. 

EPLKDTAAGQNGKVRLFPSEAVIAEGILKSTRGK 

snfinsVNSVFSDTPFVAST 


3486 


A 


357 


1173 


GDPRETKVFPSRSFARNTVUVSHHgSHLmi VSK 
TYVEDKHKILYCBVPKAGCSNWKRILMVLNGLA 
-SSAYNISHNAVHYGKHLKJaDSFDUCGIYTKL 
YTK^^VLVRDPMERLVSAFRDKFDHPNSYYHPVF 
GKAIKKYRPNACEEALINQSGVKFKEFIHYLLDS 
HRPVGMDIHWEKVSKLCYPCLINYDFVGKFETL 
EEDANYFLQMIGAPKELKFPNFKDRHSSDERTNA 
QWRQYLKDLTRTERQLIYDFYYLDYLMFNYTT 

PFL 


3487 


A 


2 


3281 


" CDKSGAVPFSriRSPKKPSPRSAGPSLSSV5h'J«)y 
LWASSGLSEEHAAPLLPAWPRHPCPPSLTTCPSM 
AOGAMRFCSEGDCAISPPRCPRRWLPEGPVPQSP 

pSmygstgsllrrvagpgprgrelgrvtapctp 

LRGPPSPRVAPSPWAPSSPTGQPPPGAQSSWIFR 

fJ^kasvrplnglpapgglsrswdlggvspprpt 

palgpgsnrklrleastsdplparggsalpgsrn 

lvhgppappqvgadglysslpnglgdpperlatl 

fggpadtgflnqgdtwssprevsshaqriarak 

weffygsldppssgakppeqappsppgvgsrqgs 

gvavgraakysetdldtvplrcyretdidevla 

ereeadsabesqpssegppgtayppaprpgplpgp 

hpslgsgnededddeaggeedvddevfeasega 

rpgsrmplkspwflpgtspsadgpdsfscvfeai 

Sshrakgtsytslaslealaspgptqspfftfel 

PPOPPAPRPDPPAPAPLAPLEPDSGTSSAADGPWT 

orgeeeeaearaklapgreppspchsedslglga 
aplgsepplsqlvsdsdseldsterlalgstdtls 

NGQ^LEy^QRLAKRLYRLDGFRKADVARHL 
GKiW>FSKLVAGEYUCFFWTGMI1X)QALRWL 

SEgetqerervlahfsqryfqcnpealsse 

f;^S?"TnALMLXNTDLHGHmGKRMTC 
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SEQID 
NO: 


Method 


Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 










NLEGLNDGGDFPRELLKALYSSIKNEKLQWAIDE 

EELRRFLSELADPNPKVIKRISGGSGSGSSPFLDLT 

PEPGAAVYKHGALVRKVHADPDCRKTPRGKRG 

WKSFHGILKGMILYLQKEEYKPGKALSE'IELKN 

AISIHHALATRASVhTYSKRPHVFYLRTADWRVI^ 

FQAPSLEQMQSWITRINWAAMFSAPPFPAAVSS 

QKKFSRPLLPSAATIILSQEEQVRTHEAKLKAMA 

SELREHRAAQLGKKGRGBLEAEEQRQKEAYLEFE 

KSRYSTYAALLRVKLKAGSEELDAVEAALAQAG 

STEDGLPPSHSSPSLQPKPSSQPRAQRHSSEPRPG 

AGSGRRKP 


3488 


A 


441 


1968 


GTETPHCWGRGTAGLRRELDREERDGPGTATMS 

FPHFGHPYRGAFQFLVASASSSTTCCESTLRSVSY 

VASGSTPAPALCCAPXYDSRLLGSARPELGAALGI 

YGAPYAAAAAAQSYPGYLPYSPEPPSLYGALNP 

QYEFKEAAGSFTSSLAQPGAYYPYERTLGQYQY 

ERYGAVELSGAGRRKNATRETTSTLKAWLNEHR 

KlSfPYPTKGEKIMLAHTmTLTQVSTWFANARRR 

LKKENKMTWAPKNKGGEERKAEGGEEDSLGCL 

TADTKEVTASQEARGLRLSDLEDLEEEEEEEEEA 

FTiPPWATAnnRT TFPRKTiAn^T PfJPPAAARPrr 

RLERRECGLAAPRFSFNDPSGSEEADFLSAETGSP 

RLTMHYPGLEKPRIWSLAHTATASAVEGAPPARP 

RPRSPECRMIPGQPPASARRLSVPRDSACDESSCI 

PKAFGNPKFALQGLPLNCAPCPRRSEPWQCQYP 

SGAEGSGPPAALGVSMQKTPTY.RPARQLHTLCH 

SSLP 


3489 


A 


718 


2073 


lAAYHKALSYRGHVHANNRGTNNVHFITPPSPS 

RGILPMNPRNMMNHSQVGQGIGPSRTNSMSSSG 

LGSPNRSSPSIICMPKQQPSRQPFTVNSMSGFGMN 

RNQAFGMNNSLSSNIFNGTDGSENVTGLDLSDFP 

ALADRNRREGSGNPTPLINPLAGRAPYVGMVTK 

PANEQSQDFSIHNEDFPALPGSSYKDPTSSNDDSK 

SNLNTSGKTTSSTOGPKFPGDKSSTTQNNNQQKK 

GIOVLPDGRVTNIPOGMVTDOFGMIGLLTFIRAA 

ETDPGMVHLALGSDLTTLGLNLNSPENLYPKFAS 

PWASSPCRPQDIDFHVPSEYLTNIHIRDKLFFFFS 

W/TAIKLGRYGEDLLFYLYYMNGGDVLQLLAAV 

ELFl^WRYHKEERVWITRAPGMEPTMKTNTY 

ERGTYYFFDCLNWRKVAKEFHLEYDKLEERPHL 

PSTFNYNPAQQAF 


3490 


A 


2 


2833 


FVAKMATSQYFDFAQGGGPQYSTQAPTLPLPTV 

GASYTGQPTPGMDPAVNPAFPPAAPAGYGGYQP 

HSGQDFAYGSRPQEPVPTATTMATYQDSYSYGQ 

SAAARSYEDRPYFQSAALQSGRMTAADSGQPGT 

QEACGQPSPHGSHSHAQPPQQAPIVESGQPASTL 

SSGYTYPTATGVQPESSASIVTSYPPPSYNPTCTA 

YTAPSYPNYDASVYSAASPFYPPAQPPPPPGPPQ 

QLPPPPAPAGSGSSPRADSKPPLPSKLPRPKAGPR 

QLQLHYCDICKISCAGPQTYREHLGGQKHRKKE 

AAQKTGVQPNGSPRGVQAQLHCDLCAVSCTGA 

DAYAAHIRGSKHQKVFKLHAKLGKPIPTLEPALA 

TESPPGAEAKPTSPTGPSVCASSRPALAKRPVASK 

ALCEGPPEPQAAGCRPQWGICPAQPKLEGPGAPT 

QGGSKEAPAGCSDAQPVGPEYVBEVFSDEGRVL 
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peptide i 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 










RFHCKLCECSFNULNAKDLHVRGRRHKLg )i KMf^ 

VNPDLPIATEPSSRARKVLEERMRKQRHLAEERL 

EOLRRWHAERRRLEEEPPQDVPPHAPPDWAQPL 

LMGRPESPASAPLQPGRRPASSDDRHVMCKHATl 

YPTEQELLAVQBAVSHAERALKLVSDTLAEEDR 

GRREEEGDKRSSVAPQTRVLKGVMRVGILAKGL 

LLRGDRNVRLALLCSEKFTHSLLRMAQQLPRQL 

OMVTEDEYEVSSDPEANIVISSCEEPRMQVnSVT 

^LMREDPSTDPGVEEPQADAGDVLSPKKCLESL 

AALRHARWFQARASGLQPCVIVIRVLRDLCRRV 

PTNWGALPAWAMELLVEKAVSSAAGPLGPGDAV 

RRVLECVATCTLLTDGPGLQDPCERDQTDALEP 

MTLQEREDVTASAQHALRMLAFRQTHKVLGMD 

LLPPRHRLGARJRKRQRGPGEGEEGAGEKKRGR 

RGGEGLV 


3491 


A 


2 


1321 


FVGDGALSGCRRGRAPRVPSMAGSLPPC V vuuu 

TGYTKLGYAGNTEPQniPSCIAIRESAKVVDQAQ 

RRVLRGVDDLDFFIGDEATOKPTYATKWPIRHGII 

EDWDLMERFMEQWFKYLRAEPEDHYFLMTEP 

PLNTPENREYLAEIMFESFNVPGLYIAVQAVLAL 

AASWTSRQVGERTLTGIVIDSGDGVTHVIPVAEG 

YVIGSCIKHIPIAGRDITYFIQQLLREREVGIPPEQS 

LETAKAKEKYCYICPDIVKEFAKYDVDPRKWIK 

QYTGINAINQKKFVIDVGYERFLGPEIFFHPEFAN 

PDFMESISDWDEVIQNCPIDVRRPLYKNVVLSG 

GSTMFRDFGRRLQRDLKRWDABLRLSEELSGG\ 

RIKPKPVEVQVVTHHMQRYAV\WFGG\SMLASTP 

FFFOVCHTKKDYEEYGPSICRHNPVFGVMS 


3492 


A 


3 


2024 


piSIGVALLHLPGAAVlPNTNYMFQDALUUKbKvja 

REESPAPSRAPASASLWRRLVWEAKMAAHAAA 

AAQAAAAQAAHAEAADSWYLALLGFAEHFRTS 

SPPKJDRLCVHCLQAVFPFKPPQKEEARTHLQLGSV 

LYHHTKNSEQARSHLEKAWLISQQIPQFEDVKFE 

AASLLSELYCQENSVDAAKPLLRKAIQISQQTPY 

WHCRLLFQLAQLHTLEKDLVSACDLLGVGAEY 

ARWGSEYTRALFLLSKGMLLLMERXLQEVHPL 

LTLCGOrVENWQGNPIQKESLRVFFLVLQVTHYL 

DAGQVKSVKPCLKQLQQCIQTISTLHDDEILPSNP 

ADIJHWLPKEHMCVLVYLVTVMHSMQAGYLE 

KAOKYTDKALMQLEKLKMLDCSPILSSFQVILLE 

HIMCRLVTGHKATALQEISQVCQLCQQSPRLFS 

NHAAQLHTLLGLYCVSVNCMDNAEAQFTTALR 

LTNHQELWAFIVTNLASVYIREGNRHQEWNLYS 

LLERINPDHSFPVSSHCLRAAAFYVRGLFSFFQGR 

YNEAKRFLRETLKMSNAEDLNRLTACSLVLLGHI 

FYVLGNHRESNNMVVPAMQLASKIPDMSVQLW 

SSALLRDLNKACGNAMDAHEAAQMHQNFSQQL 

LQDHIEACSLPEHNLITWTDGPPPVQFQAQNGPN 

TSLASLL 


3493 


A 


3 


2024 


REESPAPSRAPASASLWRRLVWEAKMAAHAAA 
AAQAAAAQAAHAEAADSWYLALLGFAEHFRTS 
SPPKIRLCVHCLQAVFPFKPPQRIEARTHLQLGSV 
LYHHTKNSEQARSHLEKAWLISQQIPQFEDVKFE 
AASLLSELYCOENSVDAAKPLLRKAIQISQQTPY 
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SEQID 
NO: 


Method 


Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 










WHCRLLFQLAQLHTLEKDLVSACDLLGVGAEY 

ARWGSEYTRALFLLSKGMLLLMERKLQEVHPL 

LTLCGQIVENWQGNPIQKESLRVFFLVLQVTHYL 

DAGQVKSVKPCLKQLQQCIQTISTLHDDEILPSNP 

ADLFHWLPKEHMCVn.VYLVTVMHSM 

KAQKYTDKALMQLEKLKMLDCSPILSSFQVILLE 

HIIMCRLVTGHKATALQEISQVCQLCQQSPRLFS 

NHAAQLHTLLGLYCVSVNCMDNAEAQFTTALR 

LTNHQELWAFIVTNLASVYIREGNRHQEWNLYS 

LLERINPDHSFPVSSHCLRAAAFYVRGLFSFFQGR 

YNEAKRFLRETLKMSNAEDLNRLTACSLVLLGHI 

FYVLGNHRESNNMVVPAMQLASKIPDMSVQLW 

SSALLRDLNKACGNAMDAHEAAQMHQNFSQQL 

LQDHIEACSLPEHNLITWTDGPPPVQFQAQNGPN 

TSLASLL 


3494 


A 


2 


1615 


VLRGQRGPAGGLAEERRRGRNEWRIHDVTTAPF 

PGLVQRRSRLLIVSQVRYFLKNKVSPDLCNEDGL 

TALHQCCIDNFEErVKLLLSHGANVNAKDNELW 

TPLHAAATCGHINLVKILVQYGADLLAVNSDGN 

MPYDLCEDEPTLDVIETCMAYQGITQEKINEMRV 

APEQQMIADIHCMIAAGQDLDWIDAQGATLLHI 

AGANGYLRAAELLLDHGVRVDVKDWDGWEPL 

HAAAFWGQMQMAELLVSHGAN\LNARTSMDE 

MPmLCEEEEFKVLLLELKXHKHDVIMKSQLRHK 

SSLSRRTSHRQAS/SVGKWRRTQPVGTGPNLWR 

VCVC/I^T?!? A TT WrM> CAVA "DFir^'D'TCTVTvT/^TkTD CTMJ 

'^TOQENlODPhlPRLE^ 

ENGLRAPVSAYQYALANGDVWKVHEVPDYSM 

AYGNPGVADATPPWSSYKEQSPQTLLELKRQRA 

AAKLLSHPFLSTHLGSSMARTGESSSEGKAPLIG 

GRTSPYSSNGTSVYYTVTSGDPPLLKFKAPIEEM 

EEKVHGCCRIS 


3495 


A 


327 


1078 


APMADTTPNGPQGAGAVQFMMTNKLDTAMWL 

CT?T 17P\/VPQ A I T7\n PT T m T-TC A A CT7VOP ATT A XT A 

LTSALRLHQRLPHFQLSRAFLAQALLEDSCHYLL 

YSLIFWSYPVTMSIFPVLLFSLLHAATYTKKVL\ 

DARG\SNSLPLLR\SVLDKLSANQQNILKFIACNEI 

FLMPATVFMLFSGQGSLLQPFIYYRFLTLRYSSKR 

NPYCRTLFNELRTVVEHIIMKPACPLFVRRLCLQS 

lAHSRLAPTVP 


3496 


A 


3 


2867 


SSRTREMEEKEILRRQIRLLQGLIDDYKTLHGNAP 

APGTPAASGWQPPTYHSGRAFSARYPRPSRRGYS 

SHHGPSWRKKYSLVNRPPGPSDPPADHAVRPLH 

GARGGQPPVPQQHVLERQVQLSQGQNWIKVKP 

PSKSGSASASGAQRGSLEEFEDTPWSDQRPREGE 

GEPPRGQLQPSRPTRARGTCSVEDPLLVCQKEPG 

KPRMVKSVGSVGDSPREPRRTVSESVIAVKASFP 

SSALPPRTGVALGRKLGSHSVASCAPQLLGDRRV 

VTCRTNKFRXIWYKWVAASSKSPRVARRALSPR 

VAAEm^CKASAGMANKVEKPQLIADPEPKPRKP 

ATSSKPGSAPSKYKWKASSPSASSSSSFRWQSEA 

GSKDHASQLSPVLSRSPSGD\RPALAHSGLKPLSG 

ETPLSAYKVKTRTXnRRRGSTSLPGDKKSGTSPA 

ATAKSHLSLRRRQALRGKSSPVLKKTPNKGLVQ 
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NO: 



Method 



Predicted 
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nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 



3497 



1586 



141 



3498 



790 



3499 



31 



3500 



190 



1586 



185 



Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 



VTKHRLCRLPPSRAHLPTKEASSLHAVRTAKlbK 

VIKTRYiaVKKTPASPLSAPPFPLSLPSWRABKLS 

LSRSLVLNRLRPVASGGGKAQPGSPWWRSKGYR 

CIGGVLYKVSANKLSKTSGQPSDAGSRPLLRTGR 

LDPAGSCSRSLASRAVQRSLAIIRQARQRREKRK 

EYCMYYNRFGRCNRGERCPYIHDPEKVAVCTRF 

VRGTCKKTDGTCPFSHHVSKEKMPVCSYFLKGI 

CSNSNCPYSHVYVSRKAEVCSDFLKGYCPLGAK 

CKKKHTLLCPDFARRGACPRGAQCQLLHRTQKR 

HSRRAATSPAPGPSDATARSRVSASHGPRKPSAS 

ORPniQlPSSAALTAAAVAAPPHCPGGSASPSSS 

KASSSSSSSSSPPASLDHEW^SLQEAALAAACSN 

RLCKLPSFISLQSSPSPGAQPRVRAPRAPLTKDSG 

I KPLHIKPRL _— - 

' ATAKDLGCAR taDRVVMHSlPSRGLNKVHUj^^R 
NLQEFLQGLSPGVLDRLYGHPATCLAVFRELPSL 
AldWvMRMLFLEQPLPQAAVALWVKOFSKA 
OEESTGLLSGLRIWHTQLLPGGLQGLmNPIFRQN 
LRIALLGGGKAWSDDTSQLGPDKHARDVPSLDK 
YAEERWEVVLHFMVGSPSAAVSQDLAQLLSQA 
GLMKSTEPGEPPCITSAGFQFLLLDTPAQLWYFM 
LOYLQTAQSRGMDLVEILSFLFQLSFSTLGKDYS 
VEGMSDSLLNFLQHLREFGLVFQRKRKSRRYYP 
T/RALAINLSSGVSGAGGTVHQPGFIV\VETOYRL 
YAYTESELQ1ALIALFSEMLYPFP\NMVVURVTR\ 
ESVQQAIASGITAQQIIHFLRTRAHPVMLKQTPVL 
PPTITDOIRLWELERDRLRFTEGVLYNQFLSQVDF 
ELLVLAHAPKLGVLVFEmTPAkRLMWTPAGHS 

nvKRFWKRQ KHSS. 

RPLGPAALMTASASSFSSSQUV(j QPSIYSl-t>(^liK 



SLFLSNGVAANDKLLLSSNRTTAIVNASVGSGQRI 
LRG\LQYIKVPVTDARDSRLYDFFDPIADLIHTyS 
MROGRTLLNCMAGVMSRSASLCLAYLMKYHSM 
S\LLDAHTWAm:SRRPIIRPNNGFWEQLINYEFK 
T.FNHvlNTVRMINSPVGNIPDryEKDLRMMrSM 



TAGFLLAPLHMQRLLTPVKRILQLTRAVytii bi. i 
PARLLPVAHQRFSTASAVPLAKTDTWPKDVGIL 
ALEVYFPAQYVDQTDLEKYNNVEAGKYTVGLG 
OTRMGFCSVQEDINSLCLTWQRLMERIQLPWD 
SVGRLEVGTETIIDKSKAVKTVLMELFQDSGNTD 
lEGIDTTNACYGGTASLFNAANWMESSSWDGRY 
AMVVCGDIAVYPSGNARPTGGAGAVAMLIGPK 
APLALERGLRGTHMENVYDFYKPNLASEYPIVD 
GKLSIQCYLRALDRCYTSYRKKIQNQWKQAGSD 
RPFILDDLQYMIFinTFCKMVQKSLARLMFNDF 
LSASSDTQTSLYKGI^GGLKIJEDTYTOKDLD 
KALLKASQDMFDKKTKASLYLSTHNGNMYTSSL 
YGCLASLLSHHSAQELAGSRIGAFSYGSGLAASF 
FSFRVSQDAAPGSPLNDKLVSSTSDLPKRLASRKC 
VSPEEFTEIMNQREQFYHKVNFSPPGD1KSLFPGT 
WVT .ERVDEQHRRKYARRPV 



2692 



MLPTEVPQSHPGPSALLLLQLLL PPTSAFFPNlWS 

IXAAPGSITOQDLTEEAALNVTLQLFLEQWPGRP 
PLRLEDFLGRTLLADDLFAAYFGPGSSRRFRAAL 
r.wgB ANA AODFLPTSRNDPDLHFDAERLGQGR 
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SEQID 
NO: 


Method 


Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 










ARLVGALRETWAARALDHTLARQRLGAALHA 

LQDFYSHSNWVELGEQQPHPHLLWPRQELQNLA 

QVADPTCSDCEELSCPRNWLGFTLLTSGYFGTHP 

PKPPGKCSHGGHFDRSSSQPPRGGINKDSTSPGFS 

PHHMLHLQAAKLALLASIQAFSLLRSRLGDRDFS 

RLLDITPASSLSFVLDTTGSMGEEINAAKIQAIIHL 

VEQRRGSPMEPWIYVLVPFHDPGFGPVFTTSDPD 

SFWQQLNEIHALGGGDEPEMCLSALQLALLHTPP 

LSDIFVFTDASPKDAFLTNQVESLTQERRCRVTFL 

VTEDTSRVQGRARREILSPLRFEPYKAVALASGG 

EVIFIXDQHIRDVAAIVGESMAALVTLPLDPPVV 

VPGQPLVFSVDGLLQKITVRIHGDISSFWIKNPAG 

VSQGQEEGGGPLGHTRRFGQFWMVTMDDPPQT 

GTWEIQVTAEDTPGVRVQAQTSLDFLFHFGIPME 

DOPHPfiT^YPT TOPVAfil OTDT T VPVXnT fT<3P ATM 

PGDPQPHFSHVILRGVPEGAELGQVPLEPVGPPE 

RGLLAASLSPTLLSTPRPFSLELIGQDAAGRRLHR 

AAPQPSTVVPVLLELSGPSGFLAPGSKVPLSLRIA 

SFSGPQDLDLRTFVNPSFSLTSNLSRAHLELNESA 

WGRLWLEVPDSAAPDSVVMVTVTAGGREANPV 

PPTHAFLRLLVSAPAPQDRH 


3501 


A 


1245 


5815 



RRAHPSHSRLSPYLSVSRDPYFFVTVSRTILTLSA 

PAPPRRTPAPSMGTALLQRGGCFLLCLSLLLLGC 

WAELGSGLEFPGAEGQWTRFPKWNACCESEMSF 

QLKTRSARGLVLYFDDEGFCDFLELILTRGGRLQ 

LSFSIFCAEPATLLADTPVNDGAWHSVRIRRQFR 

NTTLFir)QVEAKWVEVKSKRRDMTVFSGLFVGG I 

LPPELRAAALKLTLASVREREPFKGWIRDVRVNS 

SQVLPVDSGEVKLDDEPPNSGGG\SPCEAGEEGE 

GGVCLNGGVCSWDDQAVCDCSRTGFRGKDCS 

QEDNNVEGLAHLMMGDQGKEEYIATFKGSEYF 

CYDLSQNPIQSSSDEITLSFKTLQRNGLMLHTGKS 

ADYVNLALKNGAVSLVINLGSGAFEALVEPVNG 

KFlTONAWHDVKVTRmRQHSGIGHAMVTISVD 

GILTTTGYTQEDYTMLGSDDFFYVGGSPSTADLP 

GSPVSNNFMGCLKEVVYKNNDVRLELSRLAKQ 

GDPKJVQOHGVVAFKCENVATLDPITFETPESFISL 

PKWNAKKTGSISFDFRTTEPNGLELFSHGKPRHQ 

KDAKHPQMIKVDFFAIEMLDGHLYLLLDMGSGT 

IKIKALLKKVNDGEWYHVDFQRDGRSGTISVNT 

LRTPYTAPGESEILDLDDELYLGGLPENKAGLVF 

PTEVWTALLNYGYVGCIRDLFIDGQSKDIRQMA 

EVQSTAGVKPSCSKETAKPCLSNPCKNNGMCRD 

GWNRYVCDCSGTGYLGRSCEREATVLSYDGSM 

FMKIQLPWMHTEAEDVSLRFRSQRAYGILMAT 

TSRDSADTLRLELDAGRViaTVNLDCIRINCNSS 

KGPETLFAGYNLND>3EWHTVRVVRRGKSLKLT 

VDDQQAMTGQMAGDHTRLEFHNIETGniERRY 

LSSVPSNFIGHLQSLTFNGMAYIDLCKNGDIDYC 

ELNARFGFRNIIADPVTFKTKSSYVALATLQAYT 

SMHLFFQFKTTSLDGLILYNSGDGNDFIWELVK 

GYLHYVFDLGNGANLIKGSSNKPLNDNQWHNV 

MISRDTSNLHTVKIDTKJTTQITAGARNLDLKSDL 

YIGGVAKETYKSLPKLVHAKEGFQGCLASVDLN 

G\RLP\DLISDGSFSCNGTDSRRGMWKGPSTnCQ 
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SEQID 
NO: 



3502 



3503 



Method 



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 
sequence 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 



394 



72 



43 



3358 



3504 



1124 



139 



Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 



EDSCSNQGVCLQQWDGt'SCUCSMTSFSGPLCWD 

PGTTYIFSKGGGQrTYKWPPNDRPSTOADRLAIGF 

STVQKEAVLVRVDSSSGLGDYLELHIHQGKIGVK 

FNVGTDDIAIEESNAIINDGKYHWRFrRSGGNA 

ILQVDSWPVBERYPAGRQLTIFNSQATmGGKEQ 

GOPFQGQLSGLYYNGLKVLNMAAENDANIAIVG 

NVRLVGEVPSSl^OTESTATAMQSEMSTSIMETTT 

TLATSTARRGKPPTKEPISQTTDDILVASAECPSD 

DEDIDPCEPSSGGLANPTRAGGREPYPQSAEVIRE 

SSSTTGMVVGrVAAAALCILILLYAMYKYRNRDE 

GSYHVDESKNYISNSAQSNGAVVKEKQPSSAKSS 

NKNKKJJKDKEYYV 



KPAHLPFTVnMPKRKPSEGAMSDKVKA/KFELQ 
RRSAGLFSKPTPPKPETRPKKDPANQRQKLPKVR 
KGKADA/SKEGNSPAEERCSMVQTQKVEGWRSG 



SELPVALSF 



SGGRGPVRVRSEQLSPSAEQVSQISQlSLOKKi'LS 
SLPPPPSRALAPTRAPDTALTIMEVAEVESPLNPS 
CKIMTITIPSMEEFREFNKYLAYMESKGAHRAGL 
AKVIPPKEWKPRQCYDDIDNLLIPAPIQQMVTGQ 
SGLFTQYNIQKKAMTVKEFRQLANSGKYCTPRY 
LDYEDLERKYWKNLTFVAPIYGADINGSIYDEGV 
DEWNIARLNTVLDWEEECGISIEGVNTPYLYFG 
MWKTTFAWHTEDMDLYSINYLHFGEPKSWYAIP 
PEHGKRLERLAQGFFPSSSQGCDAFLRHKMTLIS 
PSVLKKYGffFDKITQEAGEFMITFPYGYHAGFN 
HGFNCAESTNFATVRWIDYGKVAKLCTCRKDM
VKISMDIFVRKFQPDRYQLWKQGKDIYTIDHTKP
TPASIPEVKAWLQRRRKVRKASRSFQCARSTSK 
RPKADEEEEVSDEVDGAEVPNPDSVTDDLKVSE 
KSEAAVKLRNTEASSEEESSASRMQVEQNLSDHl 
KLSGNSCLSTSVTEDIKraDDKAYAYRSVPSISSE 
ADDSIPLSTGYEKPEKSDPSELSWPKSPESCSSVA 
ESNGVLTEGEESDVESHGNGLEPGEIPAVPSGER 
NSFKVPSIAEGENKTSKSWRHPLSRPPARSPMTL 
VKQQAPSDEELPEVLSIEEEVEETESWAKPLIHL 
WOTKPPNFAAEQEYNATVARMKPHCAICTLLMP 
YKKPDSSNEENDARWETCLDEWTSEGKTKPLIP 
EMCFIYSEENIEYSPPNAFLEEDGTSLLISCAKCC 
VRVHASCYQIPSHEICDQWLCARCKRNAWTAEC 
CLCNLRGGALKQTKNNKWAHVMCAVAVPEVR 
FTNVPERTQIDVGRIPLQRLKLKCIFCRHRVKRVS 
GACIQCSYGRCPASFHVTCAHAAGVL\MEPDDW 
PYVmXCFRHKVNPNVKSKACEKVISVGQTVlT 
KHRNTRYYSCRVMAVTSQTFYEVMFDDGSFSRD 
TFPEDIVSRDCLKLGPPAEGEWQVKWPDGKLY 
GAKYFGSNL^HMYQVEFEDGSQIAMKEHJIYTL 
DEELPKRVKARFVSAGRCHLGTCQVNSLSSPHVS 
OAQQETYLGFWINSKKSQCNIFLSGTY 



RGEEOFDAEFRRFACLGFGERLQEFSRLLKAVriR 
SRAWTCYLAIRMLMATCCPSPTTTACTGPWQRA 
PPLRLLVQKREADSSGLAFASNSLQRRKKGLLLR 
PVAPLRTKPPLLISLPQDFRQySSVIDVDLLPETH 
RRVRLHKHGSDRPLGFYIRDGMSVRVAPQG\LER 
vpnTFISRLVRGGLAESTGLLAVSDEILEVNGIEV 










AGKTLNQVTDMMVANSHN\LIVT\nKPA^ 
WRGASGRLTGPPSAGPGPAEPDSDDDSSDLVIE 
NRQPPSSNGLSQGPPCWDLHPGCRHPGTRSSLPS 
LDDQEQASSGWGSRIRGDGSGFSL 


3505 


A 


3 


2898 


SCRSATSQSGCGGGRSWLCSSLKMAAQPPRGIRL 

SALCPKFLHTNSTSHTWPFSAVAELIDNAYDPDV 

NAKQIWIDKTVINDHICLTFTDNGNGMTSDKLH 

KMLSFGFSDKVTMNGHVPVGLYGNGFKSGSMVR 

LGKDAIVFTKNGESMSVGLLSQTYL\EVIKAEHV 

VWIVAFNKHRQMINLAESKASLAAILEHSLFSTE 

QKLLAELDAIIGKXGTRinWNLRSYKNATEFDFE 

KDKYDIRIPEDLDEITGKKGYKKQERMDQIAPES 

DYSLRAYCSILYLKPRMQIILRGQKVKTQLVSKS 

LAYIERDVYRPKFLSKTVRITFGFNCRNKDHYGI 

MMYHRI^IKAYEKVGCQUIANNMGVGW 

ECNFLKFTHNKQDFDYTNEYRLTITALGEKLND 

YWNEMKVKKNraYPLNLPVEDIQKRPDQTWVQ 

CDACLKWRKLPDGMDQLPEKWYCSNNPXDPQFR 

NCEVPEEPEDEDLVHPTYEKTYKKTNKEKFRJRQ 

PEMIPRINAELLFRPTULSTPS\FSSPKESVSKJl^ 

LSEGTNSYATRLLNNHQVPPQSEPESNSLKRRLS 

TRSSILNAKNRRL\SSQF\ENSVYKG\DDDDEDVII 

LEENSTPKPAVDHDIDMKSEQSHVEQGGVQVEF 

VGDSEPCGQTGSTSTSSSRCDQGNTAATQTEVPS 

LVVKKEETVEDEIDVRNDAVILPSCVEAEAKIHE 

TQETTDKSADDAGCQLQELRNQLLLVTEEKENY

JSkJ\.l^L«rllVJJ7 V l^V^V^IUJ^JZiiVllNl^JV I V JMnJD 1 l^JCl 

QSTETDAVFLLESINGKSESPDHMVSQYQQALEE 

ffiRLKKQCSALQHVKAECSQCSNNESKSEMDEM 

AVQLDDVFRQLDKCSIERDQYKSEVELLEMEKS 

QIRSQCEELKTEVEQLKSTNQQTATDVSTSSNIEE 

SVNHMDGESLKLRSLRVNVGQLLAMIVPDLDLQ 

QVNYDVDWDEILGQWEQMSEISST 


3506 


A 


2 


2120 


RPPEAGGRYRAGGRRQAAKPSRPPLPSRRRLPQG 

GRTRRAMDRPAAAAAAGCEGGGGPNPGPAGGR 

RPPRAAGGATAGSRQPSVETLDSPTGSHVEWCK 

QLIAATISSQISGSVTSENVSRDYKALRDGNKLA 

QMEEAPLFPGESIKAIVKDVMYICPFMGAVSGTL 

TVTDFKLYFKNVERDPHFILDVPLGVISRVEKIGA 

QSHGDNSCGIEIVCKDMRNLRLAYK\QEEQSKLG 

IFE>a,NKHAFPLSNGQALFAFSYKEKFPINGWKV 

YDPVSEYKRQGLPNESWKISKINSNYEFCDTYPA 

nWPTSVKDDDLSKVAVFLAKGRVPVLSWIHPE 

SQATITRCSQPLVGPNDKRCKEDEKYLQTIMDAN 

AQSHKLIIFDARQNSVADTNKTKGGGYESESAYP 

NAELVFLEIHNIHVMRESLRKLKEIVYPSIDEARW 

LSNVDGTHWLEYIRMLLAGAVRIADKJESGKTSV 

WHpQnnwn'RTAnT t^t amt mi n^WRTTK^riFP 

V V riooi^ w W l^Jx 1 rW^l^ 1 0 JU/VLVJJUlVJULiJLI O I I Ix. 1 UVVJr' Ct 

TLVEKEWISFGHRFALRVGHGNDNHADADRSPIF 

LQFVDCVWQMTRQFPSAFEFNELFLITILDHLYS 

CLFGTFLCNCEQQRFKEDVYTKTISLWSYINSQL 

DEFSNPFFVNYENHVLYPVASLSHLELWVNYYV 

RWNPRMRPQMPIHQNLKELLAVRAELQKRVEG 

LQREVATRAVSSSSERGSSPSHFATSVHTLV 


3507 


A 


1 


2169 


GSSIKIRLTVLCAKNLAKKDFFRLPDPF\AKIVVD 
3508 



6388 






GSGQCHSTDTVKNTLDPKWNQHYDLyVOKlUSI 
TISVWNHKKIHKKQGAGFLGCVRLLSNAISRLKD 
TGYQRLDLCKLNPSDTDAVRGQIWSLQTRDRIG 
TGGSWDCRGLLENEGTVYEDSGPGRPLSCFME 
EPAPYTDSTCAAAGGGNCRFVESPSQDQRLQAQ 
M.RNPDVRGSIX3TPQNRPHGHQSPELPEGYEQRT 
TVOGOVYFLHTQTGVSTWHDPRIPRDLNSVNCD 
ELGPLPPGWEVRSTVSGRIYFVDHNNRTTQFTOP 
RLHHIMNHQCQLKEPSQPLPLPSEGSLEDK^A 
ORYERDLVQKLKVLRHELSLQQPQAGHCRIEVS 
REEIFEESYRQIMKMRPKDLKKRLMVKFRGEEG 

Syggvarewlyllchemlnpyyglfqystdni 

YMLQIOTDSSIOTDHLSYFHFVGRMGLAVFHGH 
YINGGFTVPFYKQLLGKPIQLSDLESVDPELHKSL 
i VWILENDITPVLDHTFCVEHNAFGRILQHELKPN 
G\B1WVTEENKKEYVRLYVNW]RFMRGIEAQFL 
ALOKGFNELIPQHLLKPFDQKELELIIGGLDKIDL 
NHV^NmiOHCVADSNIVRWFWQAVETFDEE 
RRARLLQFVTGSTRVPLQGFKALQGSTGVAAGPR 
LF^IDANTDNLRKAHTCFNRIDIPPYESYEKL 

VFKT .1 .TA VEETCGFAVE 

ILYINPADLGWNPPVSS Wlbl<J<big 1 bRANLTILF 

DKYLPTCLDTLRTRFKKIIPIPEQSMVQMVCHLLE 

CLLTIEDIPADCPKEIYEHYFVFAAIWAFGGAMV 

ODQLVDYRAEFSKWWLTEFKTVKFPSQGTIFDY 

YTOPETKKFBPWSKLVPQFEFDPEMPLQACLVHT 

SEmVCYFMERLMARQRPVMLVGTAGTGKSVL 

VGAkLASLDPEAYLVKNVPFNYYTTSAML^AVL 

EKPLEKKAGRNYGPPGNKKLIYFIDDMNMPEVD 

AYGTVOPHTIIRQHLDYGHWYDRSKLSLKEITNV 

QYVSO^AGSFTINPRLQRHFSVFVLSFPGAJ) 

ALSSIYSnLTQHLKLGNFPASLQKSIPPLIDLALAF 

HOKUTrFLPTOIKFHyiFNIJRDFAMFQGE.FSSV 

ECVKSTWDLIRLYLHESNRVYRDKMVEEKDFDL 

FDKIQTEVLKKTFDDffiDPVEQTQSPNLYCHFAN 

GIGEPKYMPVQSWELLTQTLVEALENHNEVNTV 

MDLVLFEDAMRHVCfflNRILESPRGNALLVGVG 

GSGKQSLTRLAAFISSMDVFQITLRKGYQIQDFK 

MDLASLCLKAGVKNLNTVFLMTDAQVADERFL 

nSdllasgeipdlysddeveot 

GLVDNRENCWKFFIDRIRRQLKVTLCFSPyGNKL 
RVRSRKFPAIWCTAIHWFHEWPQQAIXSySLRF 
LONTEGIEPTVKQSISKFMAFVHTSVNQTSQSYLS 
mYNYTTPKSFLEFIRLYQSLLHRHRKELKCK 
TEW.ENGLLKLHSTSAQVDDLKAKLAAQEVmJK: 
OKNEDADKLIQWGVETDKVSREKAMADEEEQ 
KVAVIMLEVKQKQKDCEEDLAKAEPALTAAQA 
ALNTLNKTNLTELKSFGSPPLAVSNVSAAVMVL 
MAPRGRVPKDRSWKAAKVIMAKVDGFIUDSLIN 
FNKENIHENCLKAIRPYLQDPEFNPEFVATKSYA 
AAGLCSWVINIVRFYEVFCDVEPKRQALNKATA 
DLTAAOEKLAAIKAmHLNENLAKLTARFEKA 
TADKLKCQQEAEVTAVTISI^^NRLVGGLASENy 
RWADAVQOTKQQERTIX:GDEiITAnSYLGFFT 






LMDDADVAAWQ>JEGLPADRMSVENATILINCE 

RWPLMVDPQLQGKWIKNKYGEDLRVTQIGQKG 

YLQIIEQALEAGAWLIENLEESIDPVLGPLLGRE 

VIKKGRFIKIGDKECEYhn>KFRL]LHTKLANPHYQ 

PELQAQATLINFTVTRDGLEIXJLLAAWSMERP 

DLEQLKSDLTKQQNGFKITLKTLEDSLLSRLSSAS 

GNFLGETVLVENLEITKQTAAEVEKKVQEAKVT 

EVKINEAREHYRPAAARASLLYFIMhTOLSKIOT 

YQFSLKAFSIVFQKAVERAAPDESLRERVANLID 

SITFSVYQYTIRGLFECDKLTYLAQLTFQILLMNR 

EVNAVELDFLLRSPVQTGTASPVEFLSHQAWGA 

VKVLSSMEEFSNLDRDffiGSAKSWKKFVESECPE 

KEKLPQEWKNKTALQRLCMLRAMRPDRMTYAL 

RDFVEEKLGSKYWGRALDFATSFEESGPAIPMF 

FELSPGVDPLKDVESQGRKLGYTFNNQNFHNVSL 

GQGQEWAEAALDLAAKKGHWVILQNTLEMCS 

RETEFKSILFALCYFHAWAERRKFGPQGWNRSY 

PFNTGDLTISVNVLYNFLEANAKVPYDDLRYLFG 

EIMYGGHITDDWDRRLCRTYLGEFIRPEMLEGEL 

SLAPGFPLPGNMDYNGYHQYIDAELPPESPYLYG 

LHPNAEIGFLTQTSEKLFRTVLELQPRDSQARDG 

AGATREEKVKALLEEILERVTDEFNffELMAKVE 

ERTPYIWAFQECGRMNILTREIQRSLRELELGLK 

nX!l TTVyTTCUTKyfCXTT i^TvT A T V"Cr\A/f\/'D'CC\X/ A D tJ AVTDC 

UcLr i JVL 1 oJrtlVlblNl-ri^IN AL, Y r JJM V riSo W AKKA Y ro 

TAGLAAWFPDLLNRIKELEAWTGDFTMPSTVWL 

TGFFNPQSFLTAIMQSTARKNEWPLDQMALQCD 

MTKKNREEFRSPPREGAYIHGLFMEGACWDTQA 

GniEAKLKDLTPPMPVMFIKAIPAD\RQDCGHVY 

SCPVTKTSQ\RDPTYVWTFNLKTKENPSKWVLA 

GVALLLQI 


3509 


A 


3 


6388 


ELYINPADLGWNPPVSSWIEKREIQTERANLTILF 

DKYLPTCLDTLRTRFKKnPIPEQSMVQMVCHLLE 

CLLTTEDIPADCPKEIYEHYFVFAAIWAFGGAMV 

QDQLVDYRAEFSKWWLTEFKTVKFPSQGTIFDY 

YIDPETKKFEPWSiaVPQFEFDPEMPLQACLVHT 

SETIRVCYFMERLMARQRPVMLVGTAGTGKSVL 

VGAKLASLDPEAYLVKNVPFNYYTTSAMLQAVL 

EKPLEKXAGRNYGPPGNKKLIYFIDDMNMPEVD 

AYGWQPHTnRQHLDYGHWYDRSKLSLKEITNV 

QYVSCMNPTAGSFTINPRLQRHFSVFVLSFPGAD 

ALSSIYSnLTQHLKLGNFPASLQKSIPPLIDLALAF 

HQKIATTFLPTGIKFHYIFNLRDFANIFQGILFSSV 

ECVKSTWDLIRLYLHESNRVYRDKMVEEKDFDL 

FDKIQTEVLKKTFDDDEDPVEQTQSPNLYCHFAN 

GIGEPKYMPVQSWELLTQTLVEALENHNEVNTV 

MDLVLFEDAMRHVCHINRILESPRGNALL VGVG 

GSGKQSLTRLAAFISSMDVFQITLRKGYQIQDFK 

MDLASLCLKAGVKNLNTVFLMTDAQVADERFL 

VLINDLLASGEIPDLYSDDEVENIISNVRNEVKSO 

GLVDNRENCWKFFroRIRRQLKVTLCFSPVGNKL 

RVRSRKFPArWCTAmWFHEWPQQALESVSLRF 

LQNTEGDEPTVKQSISKFMAFVHTSVNQTSQSYLS 

NEQRYNYTTPKSFLEFIRLYQSLLHRHRKELKCK 

TERLENGLLKLHSTSAQVDDLKAKLAAQEVELK 

QKNEDADKLIQWGVETDKVSREKAMADEEEQ 










KVAVIMLEVKQKQKDCEHDLAKAEPALTAAyA 

ALNTLNKTNLTELKSFGSPPLAVSNVSAAVMVL 

MAPRGRVPKDRSWKAAKVTMAKVDGFLDSLIN 

FNKENIHENCLKAIRPYLQDPEFNPEFVATKSYA 

AAGLCSWVmiVRFYEVFCDVEPKRQALNKATA 

DLTAAQEKLAAKAKIAHLNENLAKLTARFEKA 

TADKLKCQQEAEVTAVTISLANRLVGGLASENV 

RWADAVQNFKQQERTLCGDILLITAFISYLGFFT 

KKYRQSLLDRTWRPYLSQLKTPIPVTPALDPLRM 

LMDDADVAAWQNEGLPADRMSVENAmiNCE 

RWPLMVDPQLQGKWIKNKYGEDLRVTQIGQKG 

YLOnEQALEAGAWLIENLEESIDPVLGPLLGRE 

VKKGRFIKIGDKECEYNPKFRLILHTKLANPHYQ 

PELQAQATLINFTVTRDGLEDQLLAAWSMERP 

DLEQLKSDLTKQQNGFKITLKTLEDSLLSRLSSAS 

GNFLGETVLVENLEITKQTAAEVEKKVQEAKVT 

EVKINEAREHYRPAAARASLLYFIMNDLSKIHPM 

YOFSLKAFSIVFQKAVERAAPDESLRERVANLID 

SrrFSVYQYTIRGLFECDKLTYLAQLTFQILLMNR 

EVNAVELDFLLRSPVQTGTASPVEFLSHQAWGA 

VKVLSSMEEFSNLDRDIEGSAKSWKKFVESECPE 

KEBCLPQEWKNKTALQRLCMLRAMRPDRMTYAL 

RDFVEEKLGSKYWGRALDFATSFEESGPATPMF 

FDLSPGVDPLKDVESQGRKLGYTFNNQNFHNVSL 

GOGOEWAEAALDLAAKKGHWVILQNTLEMCS 

BETEFKSILFALCYFHAWAERRKFGPQGWNRSY 

PFNTGDLTISVNVLYNFLEANAKVPYDDLRYLFG 

EIMYGGHITDDWD&RLCRTYLGEFIRPEMLEGEL 

SLAPGFPLPGNMDYNGYHQYIDAELPPESPYLYG 

LHPNAEIGFLTQTSEKLFRTVLELQPRDSQARDG 

AGATREEKVKALLEEILERVTDEFNIPELMAKVE 

ERTPYIWAFQECGRMNILTREIQRSLRELELGLK 

GELTMTSHMENLQNALYFDMVPESWARRAYPS 

TAGLAAWFPDLLNRKELEAWTGDFTMPSTVWL 

TGFFNPQSFLTAIMQSTARKNEWPLDQMALQCD 

MTKKNREEFRSPPREGAYIHGLFMEGACWDTQA 

GnXEAKLKDLTPPMPVMFIKAIPADVRQDCGHVY 

SCPVTKTSQVRDFTYVWTFNLKTKENPSKWVLA 

GVALLLOI 


3510 


A 


390 


3330 


" AAGSGSRPPAPAARKMAULAECNlKVMUKfKi-i. 
NESEVNRGDKYIAKFQGEDTWIASKPYAFDRVF 
OSSTSQEQVYNDCAKKIVKDVLEGYNGTIFAYG 
OTSSGKTHTMEGKLHDPEGMGIIPRIVQDIFNYFSf 
SMDENLEFHKVSYFEIYLDKIRDLLDVSKTNLSV 
HEDKNRVPYVKGCIERFVCSPDEVMDTIDEGKS 
1.JRHVAVTNMNEHSSRSHSIFLINVKQENTQTEQK 
LSGKLYLVDLAGSEKVSKTGAEGAVLDEAKNIN 
KSLSALGNVISALAEGSTYVPYRDSKMTRILQDS 
LGGNCRTTIVICCSPSSYNESETKSTLLFGQRAKTI 
wrmrrAnJVTJl TAFOWKKKYEKEKEKNKILRNTI 
QWLENELNRWRNGETVPIDEQFDKEKANLEAFT 
VDKDITLTNDKPATAIGVIGNFTDAERRKCEEEIA 
KLYKQLDDKDEEINQQSQLVEKLKTQMLDQEEL 
LASTRRDQDNMQAELNRLQAENDASKEBVKEV 
LOALEELAVNYDQKSQEVEDKTKEYELLSDELN 



WO 01/57190



PCT/US01/04098













QKSATLASmAELQKLKEMThraQKKRAAEMMA 

SLLKDLAEIGIAVGNNDVKQPEGTGMIDEEFTVA 

RLYISKMKSEVKTIvrvrKRCKQLESTQTESNKKME 

ENEKELAACQLiUSQHEAKIKSLTEYLQNVEQKK 

RQLEESVDALSEELVQLRAQEKVHEMEKEHLNK 

VQTANEVKQAVEQQIQSHRETHQKQISSLRDEVE 

AKAK1.ITDLQDQNQKMMLEQERLRVEHEKLKA 

TDOEKSRKT HFT TVMODRRFOARODT KGT FETV 

AKELQTLHNLRKLFVQDLATRVKKSAEIDS\DDT 

GGSAAQKQKISFLENNLE\QLTKSAQTSWYRDNA 

DLRCELPKLEKRLRATAERVKALESALKEAKEN 

ASRDRKRYQQEVDRIKEAVRSKNMARRGHSAQI 

AKPIRPGQHPAASPTHPSAIRGGGAFVQNSQPVA 

VRGGGGKQV 


3511 


A 


1 


1757 


MASVQASRRQWCYLCDLPKMPWAMVWDFSEA 

VCRGCVNFBGADRIELLIDAARQLKRSHVLPEGR 

SPGPPALKHPATKDLAAAAAQGPQLPPPQAQPQP 

SGTGGGVSGQDRYDRATSSGRLPLPSPALEYTLG 

SRLANGLGREEAVAEGARRALLGSMPGLMPPGL 

LAAAVSGLGSRGLTLAPGLSPARPLFGSDFEKEK 

QQRNADCLAELNEAMRGRAEEWHGRPKAVREQ 

LLALSACAPFNVRFKKDHGLVGRVFAFDATARP 

PGYEFELKLFTEYPCGSGNVYAGVLAVARQMFH 

DALREPGKALASSGFKYLEYERRHGSGEWRQLG 

ELLTDGVRSFREPAPAEALPQQYPEPAPAALCGP 

-Qij^RHW^GGPYSAETPiSVPSPIAALKl^ 
LGHSPKDPGGGGGPVRAGGASPAASSTAQPPTQ 
HRLVARNGEAEVSPTAGAEAVSGGGSGTGATPG 
APLaCTLCRERLEDTHFVQ\CPPVPEHKFCFPCSR 
KFIKAQGPAGEWYCPSGDKCPLVGSSVPWAFMQ 
GEIATILAGDIKVKKERDP 


3512 


A 


3 


1994 


NTKSSSVTNSAAGVEDLNIVQVTVPDNEBCERLSS 

lEKKQLREQVNDLFSRKFGEAIGVDFPVKVPYR 

KITF>n>GCVVIDGMPPGVVFKAPGYLEISSMRRIL 

EAAEFIKFTVIRPLPGLELSNGEYSTVGKRKIDQE 

GRVFQEKWERAYFFVEVQNISTCLICKRSMSVSK 

EYNLRRHYQThffiSKHYDQYMERMRDEKLHELK 

KGLRKYLLGLSDTECPEQKQVFANPSPTQKSPVQ 

PVEDLAGNLWEKLREKIRSFVAYSIAIDEITDINN 

TTQLAIFIRGVDENFDVSEELLDTVPMTGTKSGN 

EIFSRVEKSLKNFCmWSKLVSVASTGTPPMVDA 

NNGLVTKLKSRVATFCKGAELKSICCIIHPESLCA 

Q\KLKMDHVMDVVVKSVNWICSRGLNHSEFTTL 

LYELDSQYGSLLYYTEIKWLSRGLVLKRFFESLE 

EIDSFMSSRGKPLPOLSSrDWTRDLAFLVDMTMH 

LNALNISLQGHSQIVTQMYDLIRAFLAKLCLWET 

HLTRNNLAHFPTLKLVSRl^SDGLNYIPKIAELK 

TEFQKRLSDFKLYESELTLFSSPFSTKIDSVHEELQ 

MEVIDLQCNTVLKTKYDKVGIPEFYKYLWGSYP 

KYKHHCAKILSMFGSTYICEQLFSIMKLSKTKYC 

SQLKDSQWDSVLfflAT 


3513 


A 


1836 


513 


FKSLLSVKWFCFSILVLIFLGTRCYWEMTQSRPSP 
DPHRGRWEGGRSRPKGGEEGRRRTRVPGLVTAS 
GPGNPLPDRLGEMAGGRHRRWGTLHLLLLVAA 










CPWASRGVSPSASAWPEHKNyHQPAIL»SSAUi.V{ 

[AEGTSISEMWQMDLQPLLIERYPGSPGSYAARQ 

HlMQRIQRLQADWVLEIDTFLSQTPYGYRSFSNn 

STLWTAKRHLVLACHYDSKYFSHWWJRVFVG 

AIDSAVPCAMMLELARALDKKLLSLKTVSDSKP 

DLSLQLEFFDGEEAFLHWSPQDSLYGSRHLAAKM 

ASTPHPPGARGTSQLHGMDLLVLLDLIGAPNPTF 

PNFFPNSARWFERLQAIEHELHELGLLKDHSLEG 

RYFQNYSYGGVIQDDHIPFLRRGVPVLHLIPSPFP 

EVWHTMDDNEENLDESTIDNLNKILQVFVLEYL 


3514 


A 


1836 


513 


FKSLLSVKWFCFSILVLU-LGlKCYWEMTgtJKl'bi' 

DPHRGRWEGGRSRPKGGEEGRRRTRVPGLVTAS 

GPGNPLPDRLGEMAGGRHRRWGTLHLLLLVAA 

LPWASRGVSPSASAWPEEKNYHQPAILNSSALRQ 

lAEGTSISEMWQNDLQPLLIERYPGSPGSYAARQ 

HIMQRIQRLQADWVLEIDTFLSQTPYGYRSFSNn 

SllJOTAKRHLVLACHYDSKYFSHWVNNRVFVG 

ATDSAVPCAMMLELARALDKKLLSLKTVSDSKP 

DLSLQLIFFDGEEAFLHWSPQDSLYGSRHLAAKM 

ASTPHPPGARGTSQLHGMDLLVLLDLIGAPKPTF 

PNFFPNSARWFERLQAIEHELHELGLLKDHSLEG 

RYFQNYSYGGVIQDDHIPFLRRGVPVLHLIPSPFP 

EVWHTMDDNEENLDESTIDNLNKILQVFVLEYL 

HL


3515


A


114


754


LCiUDLTTTOSSKRTKTKTKKKPQRATSNVl-AJVU' 
DOSQIQEFKEAFNMIDQNRDGFIDKEDLHDMLAS
L&NPTDEYLDAMNINEAPGPINFTMFLTMFGEK 
LNGTDPEDVIKNAFACFDEEATQTIQEDYLRELL 
1TMGDRFMDE\EVDELYREAPIVDKKGGIFNYI\E 
FTRHLETGGPKDKDDRKITFQIPSPNVPWLATFG 

VFLEIFLLHGP


3516


A 


1 


5169 


MAAAPSALLLLPPFPVLSlYRLQSRSRPSAPblUL. 

SRVGGIMRGEKNYYFRGAAGDHGSCPTTTSPLA 

SALLMPSEAVSSSWSESGGGLSGGDEEDTRLLQL 

LRTARDPSEAFQALQAALPRRGGRLGFPRRKEAL 

YRALGRVLVEGGSDEKRLCLQLLSDVLRGQGEA 

GQLEEAFSLALLPQLWSLREENPALRKDALQIL 

HICLKRSPGEVLR11.IQQGLESTDARLRASTALLL 

PILLTTEDLLLGLDLTEVnSLAKKLGDQElEEESE 

TAFSALQQIGERLGQDRFQSYISRLPSALRRHYN 

RRLESQFGSQVPYYLELEASGFPEDPLPCAVTLS 

NSNLKFGIIPQELHSRLLDQEDYKNRTQAVEELK 

OVLGKFNPSSTPHSSLVGFISLLYNLLDDSNFKW 

HGTLEVLHLLVIRLGEQVQQFLGPVIAASVKVLA 

DNKLVIKQEYMKIFLKLMKEVGPQQVLCLLLEH 

LKHKHSRVREEVVNICICSLLTYPSHDFDLPKLSF 

DLAPALVDSKRRVRQAALEAFAVLASSMGSGKT 

SILFKAVDTVELQDNGDGVMNAVQARLARKTLP 

RLTEQGFVEYAVLMPSSAGGRSNHLAHGADTD 

WIXAGNRKJSAHCHCGDHVRDSMHIYGSYSPTI 

CTRRVLSAGKGKNKLPWENEQPGIMGENQTSTS 

KDIEQFSTYDFIPSAKLKLSQGMPVNDDLCFSRK 

RVSRNLFQNSRDFNPDCLPLCAAGTTQ-raQTNLS 

GKCAQLQFSOICGKTGSVGSDLOFLGTTSSHQEK 













VYASLNFGSKTQQTFGSQTECTSSNGQNPSPGAY 

ILPSYPVSSPRTSPKHTSPLnSPKKSQDNSVNFSNS 

WPLKSFEGLSKPKSHRRSLSAQKSS\DPTGR\NHG 

\ENSQEKPP\VQLTPAL\VRSPSSRRGLNGTKPVPPI 

P\RGISLLPDKADLSTVGHKKKEPDDIWKCEKDS 

LPrDLSELNFKDKDLDQEEMHSSLRSLRNSAAKK 

RAKLSGSTSDLESPDSAMKLDLTMDSPSLSSSPNI 

NSYSESGVYSQESLTSSLSTTPQGKRIMSDIFPTFG 

SKPCPTRLSSAKKKISHIAEQSPSAGSSSNPQQISS 

FDFTTTKALSEDSVVVVGKGVFGSLSSAPATCSQ 

SVISSVENGDTFSIKQSIEPPSGIYGRSVQQNISSYL 

DVENEKDAKVSISKSTYNKMRQKRKEEKELFHN 

KDCEKKEKNSWERMRHTGTEKMASESETPTGAI 

SQYKERMPSVTHSPEIMDLSELRPFSKPEIALTEA 

LRLljyDEDWEKKffiGLNmCUU\FHSEILNTKL 

HEThn^AWQEVKNLRSGVSRAAVVCLSDLFTYL 

KKSMDQELDTTVKVLLHKAGESNTFIREDVDKA 

LRAMVNNVTPARAVVSLINGGQRYYGRKMLFF 

MMCHPNFEKMLEKYVPSKDLPYIKDSVRNLQQK 

GLGEIPLDTPSAKGRRSHTGSVGNTRSSSVSRDA 

FNSAERAVTEVREVTRKSVPRNSLESAEYLKLIT 

ni T MAK'nFpnprKjnrKTiT t ^nTTTKrwoni wmsnv 

VJ J-»JjIN/xRJL/Jr J^X'IvIlN VJIJVk^I-tl^ol^ 1 XJIN IN V^JLIi-f V V \ JlN 1 V 

KIFDAFKSRLHDSNSK\^VALETMHKMIPLLRD 

HLSPIINMLIPAIVDNNLNSKNPGIYAAATNVVQA 

LSQHVDNYLLLQPFCTKAQFLNGBCAKQDMTEKL 

ADIVTELYQRKPHATEQKVLVVLWHLLGNMTN 

SGSLPGAGGNIRTATAKLSKALFAQMGQNLLNQ 

AASQPPHIKKSLEELLDMTILNEL 


3517 


A 


1449 


252 


QDLKPVLDREYLAIYLKMVFFTCNACGESVKKI 

QVEKHVSVCRNCECLSCIDCGKDFWGDDYKNH 

VKCISEDQKYGGKGY/EKVKTHKGD/ASKQQAW 

IQKISELIK\RPNVSPKVRELLEQISAFDNVPQ\KK 

AKFQNWMKNSLKVHNESILDQVWNIFSEASNSE 

PVNKEQDQRPLHPVANPHAEISTKVPASKVKDA 

NHQENSRNQKPKKRKKGQEADLEAGGEEVPEA 

NGSAGKRSKKKKQRKDSASEEEARVGAGKRKR 

RHSKVETDSKKKKMKLPEHPEGGEPEDDEAPAK 

GKFhTWKGTIKAILKQAPDmTIKKIJa^ 

YTVTDEHHRSEEELLVIFNKKISKNPTFI^ 

VKLVK 


3518 


A 






IJVALMASPSKAVIWGNGGGDVTTHGWYGWVK 

KELEKPGFQCLAKNMPDPITARESIWLPFNffiTEL 

HCDEKTinGHSSGAIAAMRYAETHRVYAIVLVSA 

YTSDLGDENERASGYFTRPWQWEKIKANCPYIV 

QFGSTDDPFLPWKEQQEVAD\SWKPNCTNSLTV 

ATFRTQSFMN 


3519 


A 


81 


2277 


VRETRREMAMAMSDSGASRLRRQLESGGFEARL 

YVKQLSQQSDGDRDLQEHRQRIQALAEETAQNL 

KRNVYQNYRQFIETAREISYLESEMYQLSHLLTE 

QKSSLESIPLTLLPAAAAAGAAAASGGEEGVGGA 

GGRDHLRGQAGFFSTPGGASRDGSGPGEEGKQR 

TLTTLLEKVEGCRHLLETPGQYLVYNGDLVEYD 

ADHMAQLQRVHGFLMNDCLLVATWLPQRRGM 


irRYNALYSIJ)GLAWNVKl)NPPMKDMFKi,LMl' 

PENRIFQAENAKIKREWLEVLED1KRALSEKRRR 

EOEEAAAPRGPPQVTSKATNPFEDDEEEEPAVPE 

VEEEKVDLSMEWIQELPEDLDVCIAQRDFEGAV 

DLLDKLNHYLEDKPSPPPVKELRAKVEERVRQL 

TEVLVFELSPDRSLRGGPKATRRAVSQLIRLGQC 

TKACELFLKNRAAAVHTAIRQLRIEGATLLYIHK 

LCHVFFTSLLETAREFEIDFAGTDSGCYSAFWW 

ARSAMGMFVDAFSKQVFDSKESLSTAAECVKVA 

KEHCQQLGDIGLDLTFnHALLVKDIQGALHSYK 

EfflEATKHKNSEEMWRRMNLMTPEALGKLKEE 

MKSCGVSNFEQYTGDDCWVNLSYTVVAFTKQT 

MGFLEEALKLYFPELHMVLLESLVEULVAVQHV 

DYSLRCEQDPEKKAFIRQNASFLYETVLNPVVEK 

RPFEnVGKPAKOLODLRNASRLIRVNPESlTSW 


3520 


A 


1706 


540 


wTSLAWPWRADGUMiiUGVLNEGFLVIiKUHiV 

HNWKARWFILRQNTLVYYKLEGGRRVTPPKGRl 

LLDGCTITCPCLEYENRPLLIKLKTQTSTEYFLEA 

CSREE«<RDAWAFE«TGAIHAGQARGKVQQLHS 

LRNSFKLPPmSLHRIVDKMHDSNTGIRSSPNMEQ 

QSTYKKTFLGSSLVDWLISNSFTASRLEAVTLAS 

MLMEENFLRPVGVRSMGAIRSGDLAEQFLDDST 

ALYTFAESYKKKISPKEEISLSTVELSGTVVKQGY 

LAKQGHKRKNWKVRRFVLRKDPAFLHYYDPSK 

EENM-VGOFSLRGSLVSALEDNGVPTOVKGNVQ 

GNLFKVnKOmHYYIQAVSSKAEVRAEVWIGSLS 

"KST,NM^^a>PEGTPI)SLPSLPR 


3521 


A 


3 


3063 


HASVSLSLGCPRPCADTPUPQPQPMDLRVOyKl'i' 

VEPPPEPTLLALQRPQRLHHHLFLAGLQQQRSVE 

PMRVKMELPACGATLSLVPSLPAFSIPRHQSQSST 

PCPFLGCRPCPQLSMDTPMPELQEAPQEQELRQL 

LHKDKSKRSAVASSVVKQKLAEmKKQQAALE 

RTVHPNSPGIPYRTLEPLETEGATRSMLSSFLPPV 

PSLPSDPPEHFPLRKTVSEPNLKLRYKPKKSLERR 

KNPLLRKESAPPSLRRRPAETLGDSSPSSSSTPAS 

GCSSPNDSEHGPNPILGSEALLGQRLRLQETSVAP 

FALPTVSLLPAITLGLPAPARADSDRRTHPTLGPR 

GPILGSPHTPLFLPHGLEPEAGGTLPSRLQPILLLD 

PSGSHAPLLTVPGLGPLPFHFAQSLMTTERLSGSG 

LHWPLSRTRSEPLPPSATAPPPPGPMQPRLEQLKT 

HVOVKRSAKPSEKPRLRQIPSAEDLETDGGGPG 

OVVDDGLEHRELGHGQPEARGPAPLQQHPQVLL 

WEOQRLAGRLPRGSTGDTVLLPLAQGGHRPLSR 

AQSSPAAPASLSAPEPASQARVLSSSETPARILPF 

TTGLIYDSVMLKHQCSCGDNSRHPEHAGRIQSIW 

SRLOERGLRSQCECLRGRKASLEELQSVHSERHV 

LLYGTOPLSRLKLDNGKLAGLLAQRMFVMLPCG 

GVGVDTDTIWNELHSSNAARWAAGSVTDLAFK 

VASRELKNGFAWRPPGHHADHSTAMGFCFFNS 

w A T A r-Tj r>T 000<5KASKILIVDWDVHHGNGTQQT 

FYODPSVLYISLHRHDDGNFFPGSGAVDEVGAGS 

GEGFNVNVAWAGGLDPPMGDPEYLAAFRIWM 

PIAREFSPDLVLVSAGFDAAEGHPAPLGGYHVSA 

KCFGYMTQQLMNLAGGAWLALEGGHDLTAIC 

DASEACVAALLGNRVDPLSEEGWKOKPNLNAIR 










SLEAWIRVHSKYWGCMQRLASCPDSWVPRVPG 
ADKEEVEAVTALASLSVGILAEDRPSEQLVEEEE 
PMNL 


3522 


A 


9 


602 


KMAALGEPVRLERDICRAIELLEKLQRSGEVPPQ 
KLQALQRVLQSEFCNAVREVYEHVYETVDISSSP 
EVRANATAKATVAAFAASEGHSHPRVVELPKTE 
EGLGFNIMGGKEQNSPIYISRnP/GGIADRHGGLK 
RGDQLLSVNGVSVEGEHHEKAVELLKAAQGKV 
KLVWYTTKVLEEMESRFEKMRSAKRRQQT 


3523 


A 


645 


1465 


IMAETSLLEAGASAASTAAALENLQVEASCSVCL 
EYLKEPVnECGHNFCKACITRWWEDLERDFPCP 

GMRASAPQHHEALSLFCYEDQEAVCLICAISHTH 

RAHTVVPLDDATQEYKEKLQKCLEAXLNQKLQEI 

TRCKSSEEKKPGELKRLVESRRQQILREFEELHRR 

LDEEQQVLLSRLEEEEQDILQRLRENAAHLGDKR 

RDLAHLAAEVEGKCLQSGFEMLKVRPLPLHSPS 

G


3524


A






PM VR HP A ftp A T fr A THnPP VT PTT ^TlPVTP V 

AETCQLAVRRLEWLQQHGGEPAAGPYLSVDPAP 
PAEERVDVGRLREALLDESRPLFERYRAMFALRN 
AGGEEAALALAEGLHCGSALFRHEVGYVLGQLQ 
HEAAVPQLAAALARCTENPMVRHECAEALGAIA 
RPACLAALQAHADDPERVVRE\SCKVALDMYEH 
ETGRAFQYADGLEQLRGAPSLGPNPHPELPEDS 


3525 


A 


1452


694


. EGl.QRjeEyLVASAACFQGLAWGGEGRGRAGCS 

-^is^iripfRWAFPT T-T ^PWtPTSIPPM^K'FPT If Wk:'^nYP 

"mtogqlr^^^efwdtapafegrkeiwdalk^ 
AAYAAEANDHELAQABLDGASITLPHGTLCECY
DELGNRYQLPIYCLSPPVNLLLEHTEEESLEPPEP
PPSVRKEFPLKVRLSTGKDVRLSASLPDTVGQLK
RQLHAQE/GTPKPSWQRWFFSGKLLTDRTRLQET
KIQKDFVIQVNNQPPPPQD


3526 


A 


123 


3441 


PGNEGLGLAADHNEDLGHLSADAPWPAVTMAP

RKRSHHGLGFLCCFGGSDIPEINLRDNHPLQFME

FSSPIPNAEELNIRFAELVDELDLTDKNREAMFAL

PPEKKWQIYCSKKKEQEDPNKLATSWPDYYIDRI

NSMAAMQSLYAFDEEETEMRNQVVEDLKTALR

TQPMRFVTRFIELEGLTCLLNFLRSMDHATCESRI

HTSLIGCDALMNNSQGRAHVLAQPEAISTIAQSL

RTENSKTKVAVLEILGAVCLVPGGHKKVLQAML

HYQVYAAERTRFQTLLNELDRSLGRYRDEVNLK

TAIMSFINAVLNAGAGEDNLEFRLHLRYEFLMLG

IQPVIDKLRQHENAILDKHLDFFEMVRNEDDLEL

ARRFDlVrmDTKSASQMFELIHKKLKYTEAYPC 

LLSVLHHCLQMPYKRNGGYFQQWQLLDRILQQI

VLQDERGVDPDLAPLENFNVKNIVNMLINENEV

KQWRDQAEKFRKEHMELVSRLERKERECETKTL

EKEENIMRTVLNKMKDKLARESOELROARGOVA

ELVAQLSELSTGPVSSPPPPGGPLTLSSSMTTNDL

PPPPPPLPFACCPPPPPPPLPPGGPPTPPGAPPCLG

MGLPLPQDPYPSSDVPLRKKRVPQPSHPLKSFNW

VKLNEERWGTVWNEROOMQVFRILDLEDFEKM

FSAYQRHQELITNPSQQKELGSTEDIYLASRKVK

elsvidgrraqnciili^klklsneeirqailkmd 
3527 



3528 



3529 



1445 



484 



714 



1777 



5684 






EQEDLAKDMUiQLLK^KSDIDLL^U^l^R 

mAradrflyemsridhyqqrlqalffkkkfqer 

LAEAKPKVEAILLASRELVRSKRLRQMLEVILAI 

ISLLHYLimEKHFPDBLNMPSELQHLPEAAKVN 
LAELEKEVGNLRRGLRAVEVELEYQRRQ^^PS 
DKWFVMSDFTIVSSFSFSELEDQLNEARDKFA^ 
ALMHFGEHDSmQPDEFFGIFDmQAFSEARQp 
LEAMRRRKEEEERRARMEAMLKEQRERERWQR 
ORKVLAAGSSLEEGGEFDDLVSALRSGEVFDKD 
1 rif I .KRSRKRSGSQALEVTRERAINRLNY 



LLGTRMLAGULEARDPKHG'IH PEDFCFUAUAV 
MEKTAVAAEVLTEDCNTGEMPPLQQQIERLHQE 
J^RQKSLWADVHGKLRSHTOAUIEQI^LREKL 
RALQLQRWKARKKSAASPHAGQESHTLALEPAF 
GKISPLSADEETIPKYAGHKNXQSGHSSWGQRSSS 
NNSAPPKPMSLKIERISSWKTPPQENRDKNLSRR 
RQDRRATPTGRPTPCAERRGWSEDGKVASDTCV 

TLHWPLGKFRFR 



RISKIQVYYSTGYSSRKMNPTLGLAU-LAVLL,! VK 
GLLKPSFSPRNYKALSEVQGWKQRMAAKELAR 
ONMDLGFKLLKKLAFYNPGRNIFLSPLSIST^S 
MLCLGAQDSTLDEIKQGFOTRKMPEKDL^Gra 
YIIHELTQKTQDLKLSIGNTLFIDQRLQPQRKFLE 
DAm^SAETlLlOTQNLEMAQKQINDFI/ESKTH 

qkdSjlienidpgtvi^anyiffrarwkhefdp 

NVTKEEDFFLEKNSSVKVPMMFRSGIYQVGYDD 
KLSCTILElPYQmTAIFILPDEGKLmJEKGLQV 
DTFSRWKTLLSRRVVDVSWRLHMTGTTDLKKT 
LSYIGVSKIFEEHGDLTKIAPHRSLKVGEAVNKA 
ELKMDERGTEGAAGTGAQTLPMETPLWKIDKP 
VT T T rySKKIPSVLFLGKIVNPIGK 



VSSVSHENFl'BVFEDGENPPSSRSSESGKl lit ^ 
OADRTDDIDRELSEGQGAAAIPIGSTSSEIETAST 
VGSEETIIOTPSWTQGTATRSRKTAQKTAMQCC 
LEYVQQFLTRLINL^QNNSFSQSLATEHQGDLG 
REOGETSKWDKNSQGDVKEKNISKQKTSKEYLS 
AFLAACQLFLECSSFPVYIAEGNHTSELRSEKLET 
DCEHVQPPQWLQTLMNACSQASDFSVQSVAISL 

vmTlvgltVamvtgeninsvepaqplsrnqg 

RV^^^PPLTQGW-RYIAEKTOFFKHV^^ 
QLGDGTPQHHQKSVElJYQlJINLVPSSSiaroW 
SOOLTHKDKKIRMEAHAKFAVLWHLTRDLHINK 
S^FVRSFDRSLFIMLDSLNSLDGSTSSVGQAWL 

NQ^LQRTOIARVLEPLLLLLL^^^ 

OAERYWNKSPCYPGEESDKHFMQNFACSNVSQ 

VOLITSKGNGEKPLTMDEIENFSLTVNPLSDRLSL 

LSTSSETIPMVVSDFDLPDQQIEILQSSDSGCSQSS 

AGDNLSYEVDPETVNAQEDSQMPKESSPDDDVQ 

OVVTOLICKWSGLEVESASVTSQLEIEAMPPKC 

SDIDPDEETDaEDDSIQQSQNALLSNESSQn.SVS 

SShecvangisrnssspcisgtthtlhdssvas 

m?SRORSHSSIQFSFKEKLSEKVSEKETIVKESG 
SoSvKLi^DDKKKSSNEKLKQTSV 
^cT.r.TnT.ENWYSCGEGDISEIESDMGSPGSRKSP 










NFNIHPLYQHVLLYLQLYDSSRTLYAFSAIKAILK 

TOPIAFVNAISTTSVNNAYTPQLSLLQNLLARHRI 

SVMGKDFYSHIPVDSNHNFRSSMYIEILISLCLYY 

MRSHYPTHVKVTAQDLIGNRNMQI^SIEILTLL 

FTELAKVIESSAKGFPSnSDMLSKCKVQKVILHC 

LLSSIFSAQKWHSEKMAGKNLVAVEEGFSEDSLI 

NFSEDEFDNGSTLQSQLLKVLQRLIV\LEHRVM\T 

IPEE\NETGFDFWS\DLEfflSPHQPMTSLQYLHAQ 

SITCQGMFLCAVIRAVLHQHCACKMHPQWIGLIT 

S1LPYMGKVLQRVWSVTLQLCRNLDNLIQQYK 

YETGLSDSRPLWMASIIPPDMILTLLEGITAIIHYC 

LLPPTTQYHQLLVSVDQKHLFEARSGILSILHMI 

MSSVTLLWSILHQADSSEmTIAASASLTTINLG 

ATKNLRQQILELLGPISMNHGVHFMAAIAFVWN 

ERRQNKTTTRTKVIPAASEEQLLLVELVRSISVM 

RAET^aQTVKEVIXQPPAIAKDKKI^LSLEVCm 

QFFYAYIQRIPWNLVDSWASLLILLKDSIQLSLP 

APGQFLILGVLNEFIMKNPSLENKKDQRDLQDVT 

HKIVDAIGAIAGSSLEQTTWLRRNLEVKPSPKIM 

VDGTNLESDVEDMLSPAMETANITPSVYSVHAL 

TLLSEVLAHLLDMWYSDEKERVIPLLVNIMHYV 

VPYLRNHSAHNAPSYRACVQLLSSLSGYQYTRR 

AWKKEAFDLFMDPSFFQMDASCVNHWRAIMDN 

LMTHDKTTFRDLMTRVAVAQSSSLNLFANRDVE 

LEQRAMLLKRLAFAIFSSEIDQYQKYLPDIQERLV 

ESLRLPQVPTLHSQVFLFFRVLLLRMSPQHLTSL 

WPTMITELVOVFLLMEOELTAbEDISRTSGPSVA 

▼T.Jt J. ITXl ' ''If T \^ V i. ' ■'F ■'1*11' V/ ' " ■*' ■"■'■■^ ■ w ^ 

GLETTYTGGNGFSTSYNSQRWLNLYLSACKFLD 

LALALPSENLPQFQMYRWAFIPEASDDSGLEVRR 

QGfflQREFKPYWRLAKLLRKRAKKNPEEDNSG 

RTLGWEPGHLLLTICTVRSMEQLLPFFNVLSQVF 

NSKVTSRCGGHSGSPILYSNAFPNKDMKLENHKP 

CSSKARQKIEEMVEKDFLEGMIKT 


3530 


A 


1 


5684 


VSSVSHENPTEVFEDGENPPSSRSSESGFTEFIQY 

QADRTDDIDRELSEGQGAAAIPIGSTSSETETAST 

VGSEETnQTPSWTQGTATRSRKTAQKTAMQCC 

LEYVQQFLTRLINLYIIQNNSFSQSLATEHQGDLG 

REQGETSKWDRNSQGDVKEKNISKQKTSKEYLS 

AFLAACQLFLECSSFPVYIAEGNHTSELRSEKLET 

DCEHVQPPQWLQTLMNACSQASDFSVQSVAISL 

VMDLVGLTQSVAMVTGENINSVEPAQPLSPNQG 

RVAWmPPLTQGNLRYIAEKTEFFKHVALTLWD 

QLGDGTPQHHQKSVELFYQLHNLVPSSSICEDVI 

SQQLTHKDKKIRMEAHAKFAVLWHLTRDLHINK 

SSSFVRSFDRSLFIMLDSLNSLDGSTSSVGQAWL 

NQVLQRHDIARVLEPLLLLLLHPKTQRVSVQRV 

QAERYWNKSPCYPGEESDKHFMQNFACSNVSQ 

VQLITSKGNGEKPLTMDEIENFSLTVNPLSDRLSL 

LSTSSETIPMWSDFDLPDQQffilLQSSDSGCSQSS 

AGDNLSYEVDPETVNAQEDSQMPKESSPDDDVQ 

QVVFDLICKWSGLEVESASVTSQLEffiAMPPKC 

SDIDPDEETIKIEDDSIQQSQNALLSNESSQFLSVS 

AEGGHECVANGISRNSSSPCISGTTHTLHDSSVAS 

lETKSRQRSHSSIQFSFKEKLSEKVSEKETIVKESG 

KQPGAKPKVKljyiKKDDDKKKSSNEKLKQTSV 
3531 



553 



2470 



3532 



3931 



317 






FFSDGLDLHNWVSCGEGDISEIESUMOSPI^KKSP 

NFNfflPLYQHVLLYLQLYDSSRTLYAFSAKAn.K 

TNPIAI^AISTrSVNNAYTPQLSLLQl^LAIU^ 

SVMGKDFYSHIPVDSNHNFRSSMYIEILISLCLYY 

MRSHYPTHVKVTAQDLIGNRNMQMMSIEILTLL 

^lSsSAKGH'SFISDMLSKCKVQKVILHC 

LLSSIFSAQKWHSEKMAGKNLVAA^EGFSEDS^^ 

NFSEDEFDNGSTLQSQLLKVLQRLIXALETOVN^T 

IPEE\NETGFDFWS\DLEfflSPHQPMTSLQYLHAQ 

SITCQGMFLCAVIRA\LHQHCACKMHPQWIGLIT 

STipYMGKVLQRVVVSVTLQLCRNLDNLIQQYK 

YETGLSDSRPLWMASnPPDMILTLLEGITAIIHYC 

LLDPTTOYHQLLVSVDQKHLFEARSGILSILHMI 

MSSXmXWSILHQADSSEKMflAASASLT^ 

ATKNLRQQILELLGPISMOTGVHFMAAIAJFVWN 

ERRQNKTTTR'IKVIPAASEEQLLLVELVRSISVM 

RAETVIQTVKEVLKQPPAIAKDKKHLSLEVCML 

OFFYAYIQRIPVPNLVDSWASLLILLKDSIQLSLP 

Sgqflilgvlnefimknpslenkkdqrdlqdvt 

™mMGAIAQSSLEQTTWLRRNLEVKPSPKIM 

^SSd^mlspametaniipsvysvi^ 

TXLSEVLAHLLDMVFYSDEKERVlPLLmi^ 
VP^XRNHSAHNAPSYRACVQLLSSLSGYQYTRR 
AWKKEAFDLFMDPSFFQMDASCVNHWRAMDN 
LMTHDKnTRDLMTRVAVAQSSSLNLFANRDVE 
tS^LKRLAFAIFSSEIDQYQKYLPDIQERLV 

SdSSvptlhsqvflffrvlllrmspqhltsl 

SlTCLVQVFlLMEQELTADEDISRTSGPS^^ 
GLETTYTGGNGFSTSYNSQRWLNLYLSACKFU) 
LALALPSENLPQFQMYRWAFIPEASDDSGLEVRR 

qg^SfkpywWllrkrakknpeednsg 
rSSwepghllltictvrsi^qllpffnvlsqw 

NSKVTSRCGGHSGSPILYSNAFPNKDMKLENHKP 

rsgKA ROKTEEMVEKDFLEG MIKT 

LISPSPALSSQPPALSLKKNLE DISGWULft^RSK 



ESVSFKDVAVDFTQEEWGQLDSPQR^^VM 

LENYQNLLALGPPLHKPDVISHLERGEEPWSMQ 

REWRGPCPEWELKAVPSQQQGICIOEEPAQ™ 
ERPLGGAQAWGRQAGALQRSQAAPVGmXCHG 

lgrpvveefplrcplfaqqrvpeggplldtr^ 

qategrtcaparlcagenastpsepekfpqwrq 

rgagagegefvcgecgkafrqsssltlhrrwhs 

S^^ECGKAFTWSTmLEHRRIHr^^^^ 

cgecgkafschsslnvhqrmtgerpykcsa^k 

NECGKAFSSHAYLWHRRIHTGEKPFDCSQCmA 

fschsslivhqrihtgekpykcsecgrafsqnhcl 

imOlOHSGEKSFKCEKCGEMFNWSSHL'reHQRL 

^^^^^^ 

hrelqdspsabp pagsmplrhwomargskpvou 



fiAOPMAAMGGLKVLLHWAGPGGGEPWVIPSES 
SEQ ID
NO: 


Method 


Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










SLTAEEVCIHIAHKVGITPPCFNLFALFDAQAQV 
WLPP>m.EIPRDASLNlLYF\RHRFYSR\NWHGM 
NPREPAVYRCGPPGTEASSDQTAQGMQLLDPAS 
FEYLFEQGKHEFVNDVASLWELSTEEEIHHFKNE 
SLGMAFLHLCHLALRHGIPLEEVAKKTSFKDCIP 
RSFRRHIRQHSALTRLRLRNVFRRFLRDFQPGRLS 
QQMVMVKYLATLERLAPRFGTERVPVCHLRLLA 
QAEGEPCYIRDSGVAPTDPGPESAAGPPTHEVLV 
TGTGGIQWWPVEEEVNKEEGSSGSSGRKPQASL 
FGKKAKAHKAFGQPADRPREPLGAYFCDFRDIT 
HVGLKEHCVSIHRQDNKCLELSLPSRAAALSFVS 
LVDGYFRLTADSSHYLCHEVAPPRLVMSIRDGm 
GPLLEPFVQAKLRPEDGLYLIHWSTSHPYRLILTV 
AQRSQAPDGMQSLRLRKFPIEQQDGAFVLEGWG 
RSFPSVRELGAALQGCLLRAGDDCFSLRRCCLPQ 
PGETSNLIIMRGARASPRTLNLSQLSFHRVDQKEI 
TQLSHLGQGTRTNVYEGRLRVEGSGDPEEGKMD 
DEDPLVPGRDRGQELRVVLKVLDPSHHDIALAF 
YETASLMSQVSHTHLAFVHGVCVRGPENIMVTE 
YVEHGPLDVWLRRERGHVPMAWKMVVAQQLA 
SALSYLENKNLVHGNVCGRNILLARLGLAEGTSP 
FIKLSDPGVGLGALSREERVERIPWLAPECLPGG 
ANSLSTAMDKWGFGATLLEICFDGEAPLQSRSPS 
EKEHFYQRQHRLPEPSCPQLATLTSQCLTYEPTQ 
RPSFRmRDLTRLQPHNLADVLTVNPDSPASDPT 
VFHKRYLKKIRDLGEGHFGKVSLYCYDPTODGT 
J^WAWALKADC^^^ 

PRHSIGLAQLLLFAQQICEGMAYLHAQHYIHRDL 

AARNVLLDNDRLVKIGDFGLAKAVPEGHEYYRV 

REDGDSPVFWYAPECLKEYKFYYASDVWSFGVT 

LYELLTHCDSSQSPPTKFLELIGIAQGQMTVLRLT 

ELLERGERLPRPDKCPCEVYHLMKNCWETEASF 

RPTFENLPILKTVHEKYQGQAPSVFSVC 


3533 


A 


182 


3465 


FRWLDFFRGSINSQFEFGRKKENMTSPAKFKKDK 

EIIAEYDTQVKEIRAQLTEQMKCLDQQCELRVQL 

LQDLQDFFRKKAEIEMDYSR>ILEKLAERFLAKT 

RSTKDQQFKKDQNVLSPVNCWNLLLNQVKRES 

RDHTTLSDIYLNNIIPRFVQVSEDSGRLFKKSKEV 

GQQLQDDLMKVLNELYSVMKTYHMYNADSISA 

QSBCLKEAEKQEEKQIGKSVKQEDRQTPRSPDSTA 

hn/RffiEKHVRRSSVKKIEKMKEKRQAKYTENKL 

KAJKARNEYLLALEATNASVFKYYIHDLSDLIDQ 

CCDLGYHASLNRALRTFLSAELNLEQSKHEGLD 

AIENAVENLDATSDKQRLMEMYNNVFCPPMKFE 

FQPHMGDMASQLCAQQPVQSELLQRCLQLQSRL 

STLKIENEEVKKTMEATLQTIQDIVTVEDFDVSD 

CFQYSNSMESVKSTVSETFMSKPSIAKRRANQQE 

TEQrm^KMKEYLEGR]^ITBCLQAKHDLLQKTL 

GESQRTDCSLARRSSTVRKQDSSQAIPLWESCIR 

nSRHGLQHEGIFRVSGSQVEVNDIKNAFERGEDP 

LAGDQNDHDMDSIAGVLKLYFRGLEHPLFPKDIF 

HDLMACVTMDNLQERALHIRKVLLVLPKTTLn 

MRYLFAFLNHLSQFSEENMMDPYNLAICFGPSL 

MSVPEGHDQVSCQAHVNELIKTraQHENIFPSPRE 






WO 01/57190



SEQ ID
NO: 



Predicted 

beginning 

nucleotide 

location 
corresponding
to first amino
acid residue of

peptide 

sequence 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 



3534 



3535 



3536



983 



PCT/US01/04098

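Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)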



TSHHSroECEPIEAIAKFDYVGRTAm.SFKK 
G™LYQRASDDWWEGRHNGIDGLIPHQYIW 

Sdtcdg\^spkseieviseppeekvtaragas 

CpSSrVADIYLANINKQRKRPESGSIRKimSDS 
SfflSS^TOSSSPGVGASCRPSSQPIMSQSLPKEG 

fpLHTOLLKDPEPAFQRSASTAGDIACAFRPVKS 
vSvSpA™^^VFPKTOATSPGVNS^ 



S^^^AAGLRDAASSAPRGMASbOHU. 

S^iSSSqWirnentcplpqemkalfk 

SrSiTYlXJQKFDSERADGTISSEIKSARGS 
SSsL^MYHKRTDRKSRHAKNVSTS 
SFmDSQGAENNMSEIQKQPKWGPVH 

SmsliSevviSaavlskgeivvknnpnesv 

S?WSCTOELSWTPMGYVVRQTLSTELS 
I^P^VTSMINLKTIASSADPKNVSIPSSEM^SD 
?^SiSmnHPTQKSKASQGSDLEQNEASRKNK 
SaCEKSTSKWVLTVQEPPRIEDAEEFPNLAVAS 

SSk^skqqpqdnfknnvkksqlpvql 

SSmiKKQHSQHAKQSSKPVVVSVGAV 

?vSa^ergVsqmktphnpldss™ 

ScGl^REIPKAKKPTSLKKnLKERQERKQRLQE 

SvSSSddtqdgesggddqfpeqaelsgpeg 

SSusmVH)KSEEPPGmQRD^ 

SSsrrfrdycsqmlskevdacvtdllke 

I'SSKCVnSPNCEKIQSKGGLDDTLHTIIDYA 
cSSSSALGRSLNKAyPVSWGIFSY 
DSoDOTHKMVELTVAARQAYKTmEN^^^ 

?5g?SsSayphrapaalqkmapqp/v^k 
eSJymwkkhleaysgctleleesleastsqm 

M>JLNL 



^^^^^^^ 

iESJSvYIYSSGSVEAQKLLFGHSTEGDILELy 
^SmGHKVESESYRKIADSlGCSTNNILFLT 

S?^IKadvhvavwrpgnagltodek 

TYYSLITSFSELYLPSST 



1302 



GRPPTAPHTGRPPTANRGDPRLU LKROCAKLLia 



^srgrpaasaglrrdrcalrrwpu<raplarat 

SGSmCA^RPRACPQGWSRARHQPGGLCL 
fSuxSDRSAQAGircWLRQAia^GRCQ^^ 
^mScCSTGRLSTSWTEEDVNDmLFKW 

^JSgaScipcketcenvdcgpgkkcrm^ 

ScVolpiXSNrrWKGPVCGLDGKTYRNECA 
^SpnPELEVOYQGRCKKTCRDVFCPGSS, 
SEQ ID
NO: 


Method 


Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










TCVWDQTNNAYCVTCNRICPEPASSEQYLCGND 
GVTYSVSACHLRKATCLLGRSIGLAYEGKCIKAK 
SCEDIQCTGGKKCLWDFKVGRGRCSLCDELCPD 
SKSDEPVCASDNATYASECAMKEAACSSGVLLE 
VKHSGSCNSISEDTCEEEEDEDQDYSFPISSELEW 


3537 


A 


285 


2123 


IGLFLQVAPLSVMAKSCPSVCRCDAGHYCNDRF 

LTSIPTGIPEDATTLYLQNNQINNAGIPSDLKNLL 

K\nERIYLYHNSLDEFPTNLPKYVKELHLQENNIR 

TITYDSLSKIPYLEELHLDDNSVSAVSIEEGAFRD 

S^^Vl,RLLFLSIWHLSTIPWGLPRTffiELRLDDNRIS 

TISSPSLQGLTSLKRLVLDGNLLNNHGLGDKVFF 

NLVNLTELSLVRNSLTAAPVNLPGTNLRKLYLQ 

DNHINRVPPNAFSYLRQLYRLDMSNNNLSNLPQ 

GITODLDNITQLILRNNPWYCGCKMKWVRDWL 

QSLPVKVNVRGLMCQAPEKVRGMAIKDLNAELF 

DCKDSGWSTIQITTAIPNTVYPAQGQWPAPVTK 

y^lTLJlJSJ^t rsA^ X. S^jJilW^ 1 X VJOa OXVXV 1 X J. J. X V JX«j V X uX^ X X 

mSWKLALPMTALRLSWLKLGHSPAFGSITETIVT 

GERSEYLVTALEPDSPYKVCMVPMETSNLYLFD 

ETPVCmTETAPLRMYNPTTTLNREQEKEPYKNP 

NLPLAAIIGGAVALVTIALLALVCWYVHRNGSLF 

SRNCAYSKGRRRKDDYABAGTKKDNSILEIRETS 

FQMLPISNEPISKJEEFVmXIFPPNGMNLYKNNH 


3538 


A 


877 


6184 


WNVKPSLLVVQLFKFSDKEEHEQNDSISGKTGET 

GVEEMIATRKVEQDSKETVKLSHEDDHILEDAGS 

SDISSDAACTNPNKTENSLVGLPSCVDEVTECNL 

ELKDmGIADKTENTLER^nCIEPLGYCEDAESNR 

QLESTEFNKSNLEWDTSTFGPESNILENAICDVP 

DQNSKQLNAIESTKBESHETANLQDDRNSQSSSV 

SYLESKSVKSKHTKPVmSKQNMTTDAPKKIVAA 

KYEVIHSKTKVNVKSVKRNTDVPESQQNFHRPV 

KVRKKQIDKEPKIQSCNSGVKSVKNQAHSVLKK 

TLQDQTLVQIFKPLTHSLSDKSHAHPGCLKEPHH 

PAQTGHVSHSSQKQCHKPQQQAPAMKTNSHVK 

EELEHPGVEHFK^EDKLKLKKPEKlNlLQPRQRRSS 

KSFSLDEPPLFEPDNIATIRREGSDHSSSFESKYMW 

TPSKQCGFCKKPHGNRFMVGCGRCDDWFHGDC 

VGLSLSQAQQMGEEDKEYVCVKCCAEEDKKTEI 

LDPDTLENQATVEFHSGDKTMECEKLGLSKHTT 

NDRTKYmDTVKHKVKTLKRESGEGRNSSDCRD 

NEIKKWQLAPLRKMGQPVLPRRSSEEKSEKIPKE 

STTVTCTGEKASKPGTHEKQEMKKKKV\EKGVL 

NVHPAASASKPSADQmQSVRHSLKDELMKRLTD 

SNLKVPEEKAAKVATKIEKELFSFFRDTDAKYKN 

KYRSLMF>«LKDPKNNILFKKVLKGEVTPDHLIR 

MSPEELASKELAAWRRRENRHTIEMIEKEQREVE 

RRPITKITHKGEIEIESDAPMKEQEAAMEIQEPAA 

NKSLEKPEGSEK\RKEEVDSMSKDTTSQHRQHLF 

DLNCKICIGRMAPPVDDLSPKKVKVWGVARKH 

SDNEAESIADALSSTSNILASEFFEEEKQESPKSTF 

SPAPRPEMPGTVEVESmARLNFIWKGFINMPS 

VAKFVTKAYPVSGSPEYLTEDLPDSIQVGGRISPQ 

TVWDYVEKIKASGTKEICVVRFTPVTEEDQISYT 

LLFAYFSSRKRYGVAANNMKQVKDMYLIPLGAT 

DKIPHPLVPFDGPGLELHRPNLLLGLIIRQKLKRQ 
Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



HSACASTSHlAETPESAPPiALPP DKKSKJEVSlnE 
VEPLMEVTKQEPPKPLRFLPGVLIGWENQPTTLE 

lSJkplpvddilqsllgttgqvydqxaqsvmeq 

KrVKEIPFLNECyrNSKIEKTDNVEVn)GENKEIK 

vkvdnisbstoksaeietsvvgsssisagsltslsl 
rgkppdvsteafltolsiqskqeetveskektlkr 

OLOEDOENNLQDNQTSNSSPCRSNVGKGNIDGN 

VSCSEl«n.VAOTARSPQFINLKM3PRQAAGR^PV 

TOESKDGDSCRNGEKHMLPGLSHNKEHLTEOm 

^S^CSAEKNSCVQQSDl^VAQNSPSV^^^ 

SOAEOAKPLQEDILMQNIETVHPFRRGSAVATSH 

FEVGNTCPSEFPSKSITFTSRSTSPRTSTNFSPMRP 

QQPNLQHLKSSPPGFPFPGPPNFPPQSMFGroPm 

PPPLLPPPGFG\FA\QNPMVPWPPV\AHLP\GQPQR 

Sgplsqasryigpqnfyqvkdirrph^kdp 

WGRQDQQQLDRPFNRGKGDRQRFYSDSHHLO^ 

eSeweqeserhrrrdrsqdkdrdrksreeg 
^^arlshgdrgtdgkasrdsrnvd^d 
kpksedyekdkebekskhregekdrdryhkdr 

dhtdrtkskr 



gswtvelslkpsaspslkwvclfoaaavn^s 
gagglirsuqctwapagparrggrgiedfpylf 

FOLTOCQQRICSVTQAGVQWCDHSSLpPQTPGL 
NQSSHLaisSRDYRl^SSFNEWFWQDRFmPP 
NWWTELEDRDGRVYPHPQDLLAALPLALVLLA 

MRlAFERFIGCPLSKWLGVRIXJTRRQ 

EKHFLTEGHRPKEPQLSLLAAQCGL-n^QTQR^^ 

iRRRRNQDRPQLTKKFCEASWRFLFYLSSFVGGL 

SVLYHESWLWAPVMCWDRYPNQLTLSCPAADS 

EA\SLYWil.LELGFYLSLLIRLPFDVKRKGGGP 

^KPRPHYDPPSTANDFKEQVIHHFVAVILMTFSY 

SS^LVLLLHDSSDYLLEACKMVWl^Y 

OOVCDALFLIFSFVFFYTRLVLm^ 
SMIGPFFGYYFFNGLLMLLQLLHVFWSCLDLRML 
YSFMKKGQMEKDIRSDVEESDSSEEAAAAQm. 
QLKNGTAGGPRPAPTDGPRSRVAGRLTNRHTTA 

T 



SPAGYCHSOLLPGCSRSA/CAUL AKHQ^LfUJua 

LSEKKIJKRYFVDYRRVLVCGGNGG^^^^ 
PRKEFGGPDGGDGGNGGHmRVDQQVKSLSSV 

LSRYQGFSGEDGGSKNCFGRSGAVLYIRVPVG'n. 

vSggrwadlscvgdeyiaalggaggkgn^ 

FLANNNRAPVTCTPGQPGQQRVLHLELKTVAHA 

S^vSagkssllrAisnarpavasypftilkp 

HVGIVHYEGHLQIAVADIPGIIRGAHQNRGLGSA 

SScrfllfWdlsqpepwtqvddlk^e 
myekglsarphaivankidlpeaqanlsqlrdh 
lgqevivlsaltgenleqlllhlkvlydayaea 

elgqgrqplrw 



DTQVSETLKRFAGKV 1 IAS V KERBEILSliLUKC V 
SEQ ID
NO: 


Method 


Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








1 


KLWKENPGLVEQYLSAILSLEPNQNYAGMLGLL 
VQFCTSHKEMDWSQHKSAlXDFmKNILMSK 
VKPPKYLLDSCAPLLRYLSHSEFKDLILFnQKSL 
LRSPEN^TISSLLASVTLDLSQYAMDIVKGLAG 
HLKSNSPRLMDEAVLALRM.ARQCSDSSAMESL 
TKHLFAILGGSEGKLTWAQKMSVLSGIGSVSHH 
WSGPSSQVLNGIVAELFIPFLQQEVHEGTLVHA 
VSVLALWCNRFIMEVPKKLTEWFKKAFSLKTST 
SAVRHAYLQCMLASYRGDTLLQALDLLPLLIQT 
VEKAASQSTQVPTITEGVAAALLLLKLSVADSQA 
EAKLSSFWQLIVDEKKQVFTSEKFLVMASEDAL 
CTVLHVLTERLFLDHPHRLTGNKVQQYHRALVA 
VLLSRTWHVRRQAQQTVRKLLSSLGGFKLAHGL 
LEELKTVLSSHKVLPLEALVTDAGEVTEAGKAY 
VPPRVLQEALCVISGVPGLKGDVTDTEQLAQEM 
LIISHHPSLVAVQSGLWPALLARMKIDPEAFITRH 
LDQHPRMTTQSPLNQSSMNAMGSLSVLSPDRVL 
PQLISTITASVQNPALRLVTREEFAIMQTPAGELY 
DKSIIQSAQQDSIKKANMKRENKAYSFKEQIIELE 
LKEEIKKKKGIKEEVQLTSKQKEMLQAQLDREA 
QVRRRLQELDGELEAALGLLDIILAKNPSGLTQYI 
PVLVDSFLPLLKSPLAAPRIKNPFLSLAACVMPSR 
LKALGTLVSHVTLRLLKPECVLDKSWCQEELSV 
AVKRAVMLLHTHTITSRVGKGEPGAAPLSAPAFS 
LVFPFLKMVLTEMPHHSEEEEEWMAQILQILTVQ 
^AQUIASPNTPPGRVDENGPELLPRVAMLRLLTW 
^AftGTbsPRti^VLASDTLTTLCASSSGDDGCAFi^ 

qeeVdvllcalqspcasvretvlrglmelhmvl 
papdtdeknglnllrrlwvvkfdkeeeirklae 

RLWSMMGLDLQPDLCSLLIDDVIYHEAAVRQAG 

AEALSQAVARYQRQAAEVMGRLMEIYQEKLYR 

PPPVLDALGRVISESPPDQWEARCGLALALNKLS 

QYLDSSQVKPLFQFFVPDALNDRHPDVRKCMLD 

AALATLNTHGKENVNSLLPVFEEFLKNAPNDAS 

YDAVRQSVVVLMGSLAKHLDKSDPKVKPIVAKL 

lAALSTPSQQVQESVASCLPPLVPAIKEDAGGMIQ 

RLMQQLLESDKYAERKGAAYGLAGLVKGLGILS 

LKQQEMMAALTDAIQDKKNFRRREGALFAFEM 

LCTMLGKLFEPYWHVLPHLLLCFGDGNQYVRE 

AADDCAKAVMSNLSAHGVKLVLPSLLAALEEES 

WRTKAGSVELLGAMAYCAPKQLSSCLPNIVPKL 

TEVLTDSHVKVQICAGQQALRQIGSVIRNPEILAI 

APVLLDALTDPSRKTQKCLQTLLDTKFVHFIDAP 

SIALIMPIVQRAFQDRSTDTRKMAAQIIGNMYSL 

TDQKDLAPYLPSVTPGLKASLLDPVPEVRTVSAK 

ALGAMVKGMGESCFEDLLPWLMETLTYEQSSV 

DRSGAAQGLAEVMAGLGVEKLEKLMPEIVATAS 

KVDIAPHVRDGYIMMEWXPITFGDKFTPYVGPn 

ALLLPQLEQGLFDDLWRIRFSSVQLLGDLLFHISG 
VTGKMTTETASEDDNFGTAQSNKAnTALGVERR 

NRVLAGLYMGRSDTQLWRQASLHVWKIVVSN 
TPRTLREILPTLFGLLLGFLASTCADKRTIAARTL 
GDLVRKLGEKDLPEIIPILEEGLRSQKSDERQGVCI 
GLSEIMKSTSRDAVLYFSESLVPTARKALCDPLE 
SEQ ID
NO:


Method


Predicted
beginning
nucleotide
location
corresponding
to first amino
acid residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










BVREAAAKlTEQLHSTlGHQALbDEJ'FLLKQLD 

DEEVSEFALDGLKQVMAIKSRWLPYLVPKLTTP 

PVNTRVLAFLSSVAGDALTRHLGVILPAVMLAL 

KEKLGTPDEQLEMANCQAVILSVEDDTGHRIIIE 

DLI^TRSPEVGMRQAAAIILNIYCSRSKADY^ 

HLRSLVSGLIRLF^©SSPVVLEESWDALNMT^^ 

Sagnqlalieelhkeirligneskgehvpgfclp 

KKGVTSILPVLKEGVLTGSPEQKEBAAKALGLyi 

rltsadalrpswsitgplirilgdrfswnvkam- 

LETLSLLLAKVGIALKPFLPQLQTTFTKALQDSNR 

GVRLKAADALGmsmiKVDPLFIELLNGIRAME 

DPGVRDTMLQALRFVIQGAGAKVDAVIRKNIVS 

LLLSMLGHDEDTTOISSAGCLGELCAPLTEEHL 

AVLOOCLLADVSGIDWMVRHGRSLALSVAVNV 

APGRLCAGRYSSDVQEMILSSATADRmAySGV 

RGMGFLMRHHIETGGGQLPAKLSSLFVKCLQNP 

SSDIRLVAEKMIWWANKDPLPPLDPQAIKPILKA 

LLDKrKDKNT\rVRAYSDQAIVNLLKMRQGEEW 

QSLSKILDVASLEVLNEVNRRSLKKLASQADSTE 


3542 


A 


62 


1130 


PWNPODFPGNRGLMG\QKGHIGPP\GQyu*J^UAr 
GMP\GLMGSNGSPGQPGTPGSKGSKGEPGIQGMP 

gaSgepgatgspgepgymglpgiqgkkgdk 

GNQGEKGIQGQKGENGRQGIPGQQGIQGHHGAK 

gergbkgepgvrgaigskgesgvdglmgpagpk 

GQPGDPGPQGPPGLDGKPGREFSEQFIRQVCTDV 
mAQLP^flLLQSGRIRNCDHCLSQHGSPGIPGPPGPI 
GPEGPRGLPGLPGRDGWGLVGVPGRPGVRGLK 
GLPGRNGEKGSQGFGYPGEQGPPGPPGPEGPPGI 
SKEGPPGDPGLPGKDGDHGKPGIQGQPGPPGICD 
PSTCFSVlABJEaDPFRKGPNY _ 


3543 


A 


654 


194 


■ PARSLEKMKASVVLSLl^YLWFSGAmGRCn^ 
Ai(M>GGLDYFERYSLENWVCLAYFESKFNPS\ 
AIYENTREGYTGFGLFQMRGSDWCGDHGRNRC 
HMSCSALLNPNLEKTIKCAKTIVKGKEGMGAWP 
XW«wvmY5?DTLARWLDGCKL 


3544 


A 


2 


1074 


" SCRLAAGRLAQWLLRASRSGMLRAUW1.KUAAA 

lSIaSwWitvglaigaasmtgyl^^^^ 

NDIYCRFAECCREERPLNASALKLDLEEKLFGQH 

LATEVJXFKALTCFRNNKNPKKPLILSLHGWAGT 

GKNFVSQMGAENLHPKGLKSNFVHLFVSTUffP 

HEOKIKLYQDQLQKWIRGNVSACANSVFIFDEM 

^\HPGIIE\AKPFLDYYEHVERVSYRVKAIFIFLS 

NAGGDLITKTAli>FWRAGRKREDIQLroLEP\a. 

SVGVFNNKHSGLWHSGLIDKNLIDYFIPFLPLi^ 

HVkMCVRAEMRARGSAIDEDIVTR.VAEEMTFFP\ 

pnpif TYSnKGCKTVOSRLDFH 


3545 


A 


3 


273 


- "sAQGRSWGKi-VRQIKRHPGm'MlGU^MGSA 
ALYLLRLALRSPDVWSWDRKNNPEPWNRLSPN 
T-vr\w*T7T A\/QTnvT(rKT KKDRPDF 


3546 


A 


23 


591 


- ALSTETRTPDMRRLLLVl^LV VVLLWJbAOAVl'A 
PKVPimQVKirSia>SEQDPEKAWGARVVEPPEK 
DDOLVVLFPVQKPKLLTTEEKPRGQGRGPILPGT 
KAWMETEDTLGRVLSPEPDHDSLYHPPPEEDQG 
raWPRLWVMPNHOVLLGPEEDQDHryHPQ*GSR 
SEQ ID
NO: 


Method 


Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










GHHCPRPVPRPRLLGLGPSLPCPS 


3547 


A 


23 


591 


ALSTETRTPDMRRLLLVTSLVWLLWEAGAVPA 
PKWEKMQVKHWPSEQDPEKLAWGARVVEPPEK 
DDQLWLFPVQKPKLLTTEEKPRGQGRGPILPGT 
KAWMETEDTLGRVLSPEPDHDSLYHPPPEEDQG 
EERPRLWVMPNHQVLLGPEEDQDH]YHPQ*GSR 
GHHCPRPVPRPRLLGLGPSLPCPS 


3548 


A 


3 


1641 


TWLPSVPAEEVQQPEMAAVLNAERLEVSVDGLT 

LSPDPEERPGAEGAPLAAATAATALATWIRSRPG 

RLRGTARSPGRRAAGGAAEEARRLEQRWGFGLE 

ELYGLALRFFKEKDGKAFHPTYEEKLKLVALHK 

QVLMGPYNPDTCPEVGFFDVLGNDRRREWAAL 

GNMSKEDAMVEFVKLLNRCCHLFSTYVASHKIE 

KEEQEKKRKEEEERRRREEEERERLQKEEEKRRR 

EEEEBILRREEEERRRIEEERLRLEQQKQQIMAAL 

NSQTAVQFQQYAAQQYPGNYEQQQILIRQLQEQ 

HYQQYMQQLYQVQLAQQQAALQKQQEVWAG 

LEPEAAEEALENGPKESLPVL^APSMWTRPQIKD 

FKEKIQQDADSVITVGRGEVVTVRVPTHEEGSYL 

FWEFATDNYDIGFGVYFEWTDSP^AVSVHVSE 

SSDDDEEEEENIGCEEKABCKNANKPLLDEIVPVY 

RRDCHEEVYAGSHQYPGRGVYLLKFDNSYSLW 

RSKSVYYRVYYTR 


3549 



A 


1837 


3593 


PAVLVLEPASQSRXQQNTASATAQHWSAQIHKE 

SFLAPVFTKDEQKHRRPYEFEVERDAKARGLEQF 

SATHGHTPIILNGWHGESAMDLSCSSEGSPGAtS ' 

PFPVSASTPKIGAISSLQGALGMDLSGILQAGLIHP 

VTGQIVNGSLRRDDAATRRRRGRRKHVEGGMD 

LIFLKEQTLQAGILEVHEDPGQATLSTTHPEGPGP 

ATSAPEPATAASSQAEKSIPSKSLLDWLRQQADY 

SLEVPGFGANFSDKPKQRRPRCKEPGKLDVSSLS 

GEERVPAIPKEPGLRGFLPENKFNHTLAEPILRDT 

GPRRRGRRPRSELLKAPSIVADSPSGMGPLFMNG 

LIAGMDLVGLQNMRNMPGIPLTGLVGFPAGFAT 

MPTGFFVKSTT ^MT PMMT PGMA AVPOA/rFfrVOG 

LLSPPMATTCTSTAPASLSSTTKSGTAVTEKTAE 

DKPSSHDVKTDTLAEDKPGPGPFSDQSEPAITTSS 

PVAFNPFLIPGVSPGLIYPSMFLSPGMGMALPAM 

QQARHSEIVGLESQBCRKKKKTKGDNPNSHPEPA 

PSCEREPSGDENCAEPSAPLPAEREHGAQAGEGA 

LKDSNNDTN 


3550 


A 


287 


39 


QLNLNFJATSQKHRDFVAESVGEKPVGSLAGIGE 
VMDKKLEEGCFDKAYVVLGQFLVLKKDEDLF*E 
WLRDTGGARTRGSRE 


3551 


A 


21 


3925 


GDLLEVGLPPGLEFPRGICLRGLRRTMSLDFGSV 
ALPVQNEDEEYDEEDYEREKELQQLLTDLPHDM 
LDDDLSSPELQYSDCSEDGTDGQPHHPEQLEMS 
WNEOMLPKSOSVNGPSCOGLEPYNKVTYKPYOS 

TT X^Xrf>^XTAA^X X^W^^h^ T X ^ XJX. k^ X^^^ XJ JL^X.^X X X^X^ ▼ X X JL^A X ^ 

SAQNNGSPAQEITGSDTFEGLQQQFLGANENSAE 

NMQIIQLQ\a.NKAKERQLENLIEKLNESERQIRY 

LNHQLVnKDEKDGLTLSLRESQKLFQNGKJEREIQ 

LEAQIKALETQIQALKVNEEQMIKKSRTTEMALE 

SLKQQLVDLHHSESLQRAREQHESrVMGLTKKY 

EEQVLSLQKNLDATVTALKEQEDICSRLKDHVK 
\=possible nucleotide insertion)

SSSSaevqrllgsnsmkkhlvsqlqndlk 

J'^S'oSSQKKWEEQIffiVSVNKKISFAVSE 

a™SSqeskeraaemvkaevl»erq 

JSSv^£tQQlLQDlX5KEGAEKKIMNA 

^SSppSitDQAVEAMFPPARGKELLSFEDVA 
^^SSvGQKDLYRDVMLENYBNMV 




SEV^GimiF^ 
SEQ ID
NO: 


Method 


Predicted 

beginning

nucleotide

location

corresponding

to first amino

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










HRHLNPDTELKRYFGARAILGEQRPRQRQRVYP 

KCTWLTTPKSTWPRYSKPGLSMRLLESKKGLSFF 

AFEHSEEYQQAQHKFLVAVESMEPNNIVVLLQT 

SPYHVDSLLQLSDACRFQEDQEMARDLVERALY 

SMECAFHPLFSLTSGACRLDYKRPENRSFYLALY 

KQMSFLEKRGCPRTALEYCKLILSLEPDEDPLCM 

LLLIDHLALRARNYEYLIRLFQEWEVGASLAHRN 

LSQLPNFAFSVPLAYFLLSQQTDLPECEQSSARQ 

KASLLTOOALTMFPGVLLPLLESCSVRPDASVSSH 

RFFGPNAEISQPPALSQLVNLYLGRSHFLWKEPA 

TMSWLEENVHEVLQAVDAGDPAVEACENRRKV 

LYQRAPRNIHRHVILSEIKEAVAALPPDVTTQSV 

MGFDPLPPSDTIYSYVRPERLSPISHGNTIALFFRS 

LLPNYTMEGERPEEGVAGGLNRNQGLNRLMLA 

VRDMMANFHLNDLEAPHEDDA*GEGEWD 


3555 


A 



2 



2106 


FDEFSALPSPSLQTSWSFGPMSRRALRRLRGEQR 
GQEPLGPGALHFDLRDDDDAEEEGPKRELGVRR 
PGGAGKEGVRVNNRFELINIDDLEDDPVVNGERS 
GCALTDAVAPGNKGRGQRGNTESKTDGDDTET 
WSEQSHASGKLRKKKKKQKNKKSSTGEASENG 
LEDIDRDLERIEDSTGLNRPGPAPLSSRKHVLYVE 
HRHLNPDTELKRYFGARAILGEQRPRQRQRVYP 
KCTWLTTPKSTWPRYSKPGLSMRLLESKKGLSFF 
AFEHSEEYQQAQHKFLVAVESMEPNNIWLLQT 
SPYHVDSLLQLSDACRFQEDQEMARDLVERALY 
SMECAFHPLFSLTSGACRLDYRRPENRSFYLALY 
;^^QMSFLEiaiGCPRTAEEYCmLSLEPDEDPLCM 
LLLDDHLALRARNYEYLIRLFQEWEVGASLAHRN 
LSQLPNFAFSVPLAYFLLSQQTDLPECEQSSARQ 
KASLLIOOALTN4FPGVLLPLLESCSVRPDASVSSH 
RFFGPNAEISQPPALSQLVNLYLGRSHFLWKEPA 
TMSWLEENVHEVLQAVDAGDPAVEACENRRKV 
LYQRAPRNIHRHVILSEIKEAVAALPPDVTTQSV 
MGFDPLPPSDTIYSYVRPERLSPISHGNTIALFFRS 
LLPNYTMEGERPEEGVAGGLNRNQGLNRLMLA 
VRDMMANFHLNDLEAPHEDDA*GEGEWD 


3556 


A 


3388 


1650 


KTRGTMFYYPNVLQRHTGCFATIWLAATRGSRL 

VKREYLRVNWKTCEEILNYVLVRVQPPQPGLP 

RPRFSLYLSAQLQIGVIRVYSQQCQYLVEDIQHIL 

ERLHRAQLQIRroMETELPSLLLPNHLAMMETLE 

DAPDPFFGMMSVDPRLPSPFDIPQIRHLLEAAIPE 

RVEEIPPEVPTEPREPERIPVTVLPPEAITILEAEPIR 

MLEBEGERELPEVSRRELDLLIAEEEEAILLEIPRL 

PPPAPAE*GQELLDQVGCQCWEGSPHFSCPFPLR 

VEGMGEALGPEELRLTGWEPGALLMEVTPPEEL 

RLPAPPSPERRPPVPPPPRRRRRRRLLFWDKETQI 

SPEKFQEQLQTRAHCWECPMVQPPERTIRGPAEL 

FRTPTLSGWLPPELLGLWTHCAQPPPKALRRELP 

EEAAAEEERRKDEVPSEDEVPREALEPSVPLMVSL 

EISLEAAEEEKSRISLIPPEERWAWPEVEAPEAPA 

LPWPELPEVPMEMPLVLPPELELLSLEAVHRAV 

ALELQANREPDFSSLVSPLSPRRMAARVFYLLLV 

LSAQQILHVKQEKPYGRLLIQPGPRFH 


3557 


A 


3388 


1650 


KTRGTMFYYPNVLQRHTGCFATIWLAATRGSRL 
VKREYLRVNWKTCEEBLNYVLVRVQPPQPGLP 



SEQ ID
NO: 



Method 



Predicted 
beginning 
nucleotide
location
corresponding
to first amino
acid residue of
peptide
sequence



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of
peptide 
sequence 



3558 



489 



2360 



3559 



489 



3560 



2360 



1198 



Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



ERLHRAQLQDUDMETELPSLLLPNHLAMME'M 
DAPDPFFGMMSVDPRLPSPFDIPQIRHLLEAAIPE 
RVEEIPPEVPTEPREPERIPVTVLPPEAITILEAEPIR 
MLEffiGERELPEVSRRELDLLIAEEEEAILLEIPRL 
FWAPAE*GQELLDQVQCQCWEGSPHFSCPFPLR 
V^GMGEALGPEELRLTGWEPGALLMEVTPreH. 

rlpappsperiu>pvpppprrrrri*llfwdketqi 
spekfoeqlqtrahcwecpmvqpperhrgpael 

FRTPT^SOmPPELLGLWTHCAQPPPKALRRELP 

^aaaeeerrkievpseievprealepsvplmvsl 

EISLEAAEEEKSRISLIPPEERWAWPEVEAPEAPA 

lpwpelpevpmemplvlppelbllsleavhrav 

ALELQANREPDFSSLVSPLSPRKMAARVFYLLLV 

, QAnon H VKOEKPYGRLLIQPGPRFH _ 

• mPRPRGRRRALDSPNAAAPPV V V CRSPOMKl^L. 

^Eediaklaetlaktqvaggqlsfkgk^^^^^ 

LNTAEDAKDVIKEffiDFDSLEALRLEGNTVGVEA 
ARVlAfcAL*KKSEllCRCHWSDMFTGRLRTEIPPA 

USLGEGLITAGAQLVEmi^DN^GPD^^^^^^ 
ALLKSSACFTLQELKLNNCGMGIGGGKILAAALT 
ECHRKSSAQGKPLALKVFVAGRNRLENDGATAL 
AEAFRVIGTLEEVHMPQNGINHPGITALAQAFAV 
NPLLRVINLNDNTFTEKGAVAMAETLKTLRQVE 
VINFGDCLVRSKGAVAIADAIRGGLPKLKELNLS 
FCEKRDAALAVAEAMADKAELEKLDLNGNTLG 
EEGCEQLQEVLEGFNMAKVLASLSDDEDEEEEE 
EGEEEEE^EEEEEEDEEEEEEEEEEEEEEPQQR^ 
QGEKSATPSRKILDPNTGEPAPVLSSPPPADVSTF 
LAFPSPEKLLRLGPKSSVLIAQQTDTSDPEKVVSA 
FLKVSSVFKDEATVRMAVQDAVDALMQKAFNS 
SFNSNTFLmLVHMQLLKSEDKVKAIANLYGP 
ENHNfVQQDYFPKALAPIXIAFVra'NSAI^ 
SCSFARHSLLQTLYKV 



I RPRPRGRRRALD SPNAAAPPV V VCKSFGbP l^L 
WMASEDIAKLAETLAKTQVAGGQLSFKGKSLK 
LNTAEDAKDVKEIEDFDSLEALRLEGNTVGVEA 
ARVIAKAL*KKSELKRCHWSDMFTGRLRTEIPPA 
USIXiEGUTAGAQLVELDLSDNAFGPDGVQGFE 
ALLKSSACF1LQELKLNNCGMGIGGGKILAAALT 
ECHKKSSAQGKPLALKVFVAQRNRI^GATAL 
AEAFRVIGTLEEVHMPQNGINHPGrrALAQAFAV 
NPLLRVINLW5NTFIEKGAVAMAETLKTLRQVE 
VINFGDCLVRSKGAVAIADAIRGGLPKLKELNLS 
FCEKRDAALAVAEAMADKAELEKLDLNGNTLG 
EEGCEQLQBVLEGFNMAKVLASLSDDEDEEEEE 
EGEEEEEMEEEEEEDEEEEEEEEEEEEEEPQQRG 
OGEKSATPSKKILDPNTGEPAPVLSSPPPADVSTF 
LAFPSPEKLLRLGPKSSVLIAQQTDTSDPEKWSA 
FLKVSSVFKDEATVRMAVQDAVDALMQKAFNS 
SSn^SNlFLTOLLVHMGLLKSEDKVKAIANLYGP 

Snhmvqqdyfpkalaplllafvttcpnsale 

srSFARHSLLQTLYKV 



FVRELPRPRPGAATAAIMVS VilM 1 VDISHLDMIH 
pAm>mwnTRLATCSSDRSVKIFDVBNGGQILIA 
SEQ ID
NO: 


Method 


Predicted

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










DLRGHEGPVWQVAWAHPMYGNILASCSYDRKV 
nWREENGTWEKSHEHAGHDSSVNSVCWAPHDY 
GLILACGSSDGAISLLTYTGEGQWEVKKINNAHT 

GGCDNLIKLWKEEEDGQWKEEQKLEAHSDWVR 
DVAWAPSIGLPTSTIASCSQDGRVFIWTCDDASS 
NTWSPKLLHKFNDVVWHVSWSITANILAVSGGD 
NKVTLWKESVDGQWVCISDVNKGQGSVSASVT 

GKSPQLQQDYFPRRSYRCSHRLnCLNVIGDAL 


3561 


A 


540 


86 


WRVKEMTSTLPKALGRKTASRSHTTLQGGSCCP 

VLWTAKLRCRKLRFPLPPPPPSSSAWPWQGWGI 

RGEQEAEGPLGETGPPVGPELSGLRQWRKLIKGR 

YGEWRGSGQKTGQPS*TTMQGGETEENRTETTT 

GNKQRESEAPWVRHTYIT 


3562 


A 


1920 


242 


PMMAMPFFERFKSSIQRPSPVLVLSQNTKRESGR 

KVQSGNmAAKTIADIIRTCLGPKSMMKMLLDP 

MGGIVMTNDGNAILREIQVQHPAAKSMIEISRTQ 

DEEVGDGTTSVnLAGEMLSVAEHFLEQQMHPTV 

VISAYRKALDDNnSTLKKISIPVDISDSDMMLNIIN 

SSITTKAISRWSSLACNIALDAVKMVQFEENGRK 

EmiKKYARVEKn>GGIffiDSCVLRGVMINKDVTH . 

PRMRRYIKNPRIVLLDSSLEYKKGESQTDIEITRE 

EDFTRILQMEEEYIQQLCEDIIQLKPDWITEKGIS 

DLAQHYLMRAMTADUIVRKTDNNRIARACGARI 

VQTJPRT?! PPnnvnxr^AI^T T PrK'T?Tnn'PV17TT7TTnP 
V oiU^jQcJLlvJDJL/JLf V Ij 1 0 AvjJLiJ^JJvJvivj J^Cr X r i r 1 1 

KDPKACmLRGASKEILSEVERNFQDAMQVCRN 

VLLDPQLVPGGGASEMAVAHALTEKSKAMTGV 

EQWPYRAVAQALEVIPRTLIQNCGASTIRLLTSLR 

AKHTQENCETWGXOnTGETGTLVDMKELGIWEPL 

AVKLQTYKTAVETAVLLLRIDDIVSGHBCKKGDD 

QSRQGGAPDAGQE 


3563 


A 


1571 


560 


GPSLLGTRGTPNPARTLQIFFLnGRRLTGRMAAV 
DDLQFEEFGNAATSLTANPDATTVNIEDPGETPK 
HQPGSPRGSGREEDDELLGNDDSDKTELLAGQK 

KJ^IFVIU.YmSNPDLYGPFWICATLVFAIAISGNLS 

NFLIHLGEKTYHYVPEFRKVSIAATnYAYAWLVP 

LALWGFLMWRNSKVMNTVSYSFLEIVCVYGYSL 

FWPAVREDNRRVALATIV'nVLLHMLLSVGCLA 
YFFDAPEMDHLPTTTATPNQTVAAAKSS 


3564 


A 


1 


328 


NSRVDDFVAHLQRPLLGPASCLGILRPAMTAHSF 
ALPGnFTTFWGLVGIAGPWFVPKGPNRGVnTML 
VATAVCCYLFWLIAILAQLNPLFGPQLKNETIWY 
VRFLWE 


3565 


A 


2 


1081 


FVTDFPARSMAATSLMSALAARLLQPAHSCSLRL 

RPFHLAAVRZN/EAWISGRKLAQQIKQEVRQEVEE 

WVASGNKRPHLSVrLVGENPASHSYVLNKTRAA 

AWGmSETIMKPASISEEELLNLINKLNNDDNVD 

GLLVQLPLPEfflDERRICNAVSPDKDVDGFHVIN 

VGRMCLDQYSMLPATPWGVWEIIKRTGIPTLGK 

NVWAGRSK>fVGMPIAMLLH1DGAHERPGGDA 

TNH-ISPlRYTPKEQLKKHmADIVISAAGIPNLrrA 

DMKEGAAVDDVGINRVHDPVTAKPKLVGDVDF 
SEQ ID
NO:


Method



3566 



3567 



Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of

peptide 

sequence 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 



1130 



248 



3498 



3568 



PCT/US01/04098

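Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)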



EGVRQKAGVlTPVPGGVUl'MTVAMLMKNlllAA 



VT RT .REREVLKSKELGVATN 



SCRRGRQQQRi<NVSLSSQFAHlMAAPAWTl^P 



GGGKRKGKAQYVLAKRARRCDAGGPRQLEPGL 
OGnSSS^^CVEEAYSLLNEYGDDMYGPE 
^TOmQ^SGSEGEDDDAEAALKKEVGDIKA5 

SMRI^QSVBSGANNVVFIRTlJffiPE^^ 

LODMYKTKKKKTRVILRMLPISGTCKAFLEDI^ 

KYAETFLEPWFKAPNKGTFQIVYKSRNNSHVNR 

S^SSIVCTLNSENKVDLTOPQYT^™IIK 

^CCLSVVKDYMUTIKYNLQEVVKSPKDPSQLN 

sSSkeaklesadksdqnntaegknnqqvp 

ENTEELGQTKPTSNPQWNEGGAKPELASQAIE 



GSKSNENDFS 



50 



GKKDSSPWTCPFH PPLQLFFVIR>iTRQlAiUftJLA 
mVRl^ADGDLDIGAKNVKLYVNKNLmjG 
KU)KGDREAPADHSILVIX3KNEKSEQLEEAMNA 

sSkgtoemagasgdkelglgcsppaetlad 

^SSnvsgkrknstncrkdslsqleeylrls 

^smgdmpsapatsppvkcppvheepsliqql 

Slmgrkiceppgktpswlqpsptgkdrkqggr 

kpkplwlspekplawkgrlpsddvigegpgetea 

SSSgwgtsrsvntkerpqrattkvh^^ 

dsdSioppnrerpasgrrgsrkdagssshgddq 

?l?mmVSSRTPSRSRWRSEQEHTLHESWSSLS 
^StGRISNTELPGDILDELLQQKSSRHSDLP 

^Sgeqpglsroqdgysgetoaggdfkipvlpy 

GQSLtwJimYVGLNGffilFSSK^VQI 

sikADPPDINILPAYGKDPRVVTNLroGVNRTQ 

orivlHVWLAPFTRGRSHSITlDFrHPCHVALIBJW 

NYNKSR1HSFRGVKDITMLLDTQCIFEGEL«^SG 

TLAGAPEHFGDTILFTTDDDILEAIFYSDEMFDLD 

VGSLDSLQDEEAMRRPSTADGEGDERPFTQAGL 

GSS?LELPSSSPVPQVTIP^GrraG^^^^ 

FTASWGDLHYLGLTGLEWGKEGQJU.PCWIS 

ASPRDLNELPEYSDDSRTLDKLIDGTOnMEMH 

JSfspgldhvvtirldraesiaglr™wk 

SPEDTYRGAKIVHVSLDGLCVSPPEGFLIRKGPG 
NCHFDFAQEILFVDYLRAQLLPQPARRLra^LE 
CASMDYEAPLMPCGFIFQFQLLTSWGDPYYIGLT 
StlJS^EKIPLSENNIAAFPDSVNSLEGVGG 
DVRTPDKLroQVNDTSDGRHMWLAPILPGLVm 
VYVIFDLPTTVSMIKLWNYAKTPHRGVKEFGU, 
VDDLLVYNGILAMVSHLVGGILPTCEPTVPYirn 
LFIEDRDIRHQEKHTTISNQAEDQDVQMMNENQ 
^AgpifnsWDPALRPKTCISEKETRRRRC 



1724 



AOGGlLSAASRFCRGGLLGPWLHPASEMAAm) 
fKSKEEKDAELDKRIEALRBKNEALIRRYQ 

d^Sgvavtaprkgrsvekenvavesekn 

lS^pgttrppgaskggritpqqggragmg 

^sSqspgeqprgggaggrgrrgrgrgsph 

J^Ssisdrkskeweerrrqnie^eme 

Saeyernqregvlepnpvrnflddprrrsgple 

l^SiS^GRNWGGPDFERVRCGL^roR 
nr^^^SAGPN^Ti.SMTGRERSEYLRWKQER 
SEQ ID
NO: 


Method 


Predicted 

beginning

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










EKIDQERLQRHRKPTGQWRREWDAEKTDGMFK 

DGPVPAHEPSHRYDDQAWARPPKPPTFGEFLSQ 

HKAEASSRRRRKSSRPQAKAAPRAYSDHDDRWE 

TKEGAASPAPETPQPTSPETSPKETPMQPPEIPAP 

AHRPPEDEGEENEGEEDEEWEDISEDEEEEEIEVE 

EGDEEEPAQDHQAPEAAPTGIPCSEQAHGVPFSP 

EEPLLEPQAPGTPSSPFSPPSGHQPVSDWGEEVEL 

NSPRTTHLAGALSPGEAWPFESV 


3569 


A 


1 


912 


MGRVGRAGVQLGRRRTTWAAERTGQAAAGGP 

GRALRGQRPDLRSGGAADSPAAGRGELYCGVLP 

RSPWFLSERRRQMADFDTYDDRAYSSFGGGRGS 

RGSAGGHGSRSOKELPTEPPYTAYVGNLPFNTV 

QGDIDAIFKDLSIRSVRLVRDKDTDKFKGFCYVE 

FDEVDSLKEALTYDGALLGDRSLRVDIAEGRKQ 

DKGGFGFRKGGPDDRGFRDDFLGGRGGSRPGDR 

RTGPPMGSRFRDGPPLRGSNMDFREPTEEERAQR 

PRLQLKPRTVATPLNQVANPNSAIFGGARPREEV 

VQKEQE 


3570 


A 


1 


912 


MGRVGRAGVQLGRRRTTWAAERTGQAAAGGP 
GRALRGQRPDLRSGGAADSPAAGRGELYCGVLP 
RSPWFLSERRRQMADFDTYDDRAYSSFGGGRGS 
RGSAGGHGSRSOKELPTEPPYTAYVGNLPFMTV 
QGDIDAIFKDLSrRSVRLVRDKDTDKFKGFCYVE 
FDEVDSLKEALTYDGALLGDRSLRVDIAEGRKQ 
DKGGFGFRKGGPDDRGFRDDFLGGRGGSRPGDR 
.RTGEPMGSRFRDGPPLRGSNMDFREPTEEERAQR 
'-pMjLKPRtVATPLNQVANPNSAiFGGARPREEV 
VQKEQE 


3571 


A 


28 


131 


RHFFGNLCAMRAKWRKXRMRRLK^^ 
RSK 


3572 


A 


3 


1202 


QSEPHRKVRVDPPVRDRPPPHPPPLLVQRALPGQ 

GQAEGSDGADGAKRRAMAHQTGIHATEELKEFF 

AKARAGSVRLKWIEDEQLVLGASQEPVGRWD 

QDYDRAVLPLLDAQQPCYLLYRLDSQNAQGFE 

WLFLAWSPDNSPVRLKMLYAATRATVKKEFGG 

GHIKDELFGTVKDDLSFAGYOKHLSSCAAPAPLT 

SAERELQQIRINEVKTEISVESKHQTLQGLAFPLQ 

PEAQRALQQLKQKMVNYIQMKLDLERETIELVH 

TEPTDVAQLPSRVPRDAARYHFFLYKHTHEGDP 

LESWFIYSMPGYKCSnCERMLYSSCKSRLLDSV 

EQDFHLEIAKKIEIGDGAELTAEFLYDEVHPKQH 

AFKQAFAKPKGPGGKRGHKRLIRGPGENGDDS 


3573 


A 


49 


1869 


PHCEPNPGAGAMVLLHVLFEHAVGYALLALKEV 

EEISLLQPQVEESVLNLGKFHSIVRLVAFCPFASS 

QVALENANAVSEGVVHEDLRLLLETmPSKKKK 

VLLGVGDPKIGAAIQEELGYNCQTGGVIAEILRG 

VRLHFHNLVKGLTDLSACKAQLGLGHSYSRAKV 

KFNX^VDNmQSISLLDQLDKDINTFSMRVRE 

WYGYHFPELVKIINDNATYCRLAQnGNRRELNE 

DKLEKLEELTMDGAXAKAILDASRSSMGMDISAI 

DLINIESFSSRVVSLSEYRQSLHTYLRSKMSQVAP 

SLSALIGEAVGARLIAHAGSLTNLAKYPASTVQIL 

GAEKALFRALKTRGNTPKYGLIFHSTFIGRAAAK 

NKGRISRYLANKCSIASRIDCFSEVPTSVFGEKLR 

EQVEERI^FYETGEIPRKNIJDVMKEAMVQAEAE 
SEQ ID NO: 

Method 

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence 

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence 



3574 



284 



2032 



3575 



"A 1 



2408 



3576 



1421 



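Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 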



^^EECEETSEKPKKKKKQKPQEVPQENGM 
S^SmXKKSFSKEELMSSDLEETAGSTSIP 
g^SKEETVNDPEEAGHRSRSKKKRKFSKEE 

SKERTARLWVQPVVSllS gQASEHRLGRmbPP 



VMQPRVGSKLPFAPRARSKERm'ASGTmfl.R 
PLPPRPGLPDERLKKLELGRGRTSGPRPRGPLRA 
DHGVPLPGSPPPTVALPLPSRTNLARSKSVSSGDL 

Sg^lgghrgtgelgaalsrlalrpepptlrr 

^raSSwGFPGPPTLFSIRTEPPASHGSFHMISAR 
St?^S^miXGSGHVGLRNL(^^^^^^^^ 

lnavlqclsstrplrdfclrrdfrqevpgggra 

oelteafadvigalwhpdsceavnptrfravfq 

kyvpsfsgysqqdaqeflkllmerlhleinrrgr 

SSSgpwsppe^ggalleepelsdddranl 

S^Sdskivdlfvgqlksclkcqacgy 

StSwcdlslppkkgfaggkvslrdcfnlft 

SSsENAPVCDRCRQKTHSTKKLTVQRFPRI 
SwTsRGSnCKsivGVDFPLQRLSLGDF 

asdkagspvyqlyalcnhsgsvhyghytalcr 
Jq?g™wsrvspvsenqvassegyvlfyql 



^SsLPERlKPPYANGLSlSHLRSSS^bD s^k 

SegrpS^Wcsmpsvicehixqfqtiseesn 

oSsLLWGD-rePSPKPEVFSNVPERDLSNVSNffl 
sSS^TGASNSKYVSADRNLIKKrAPW™ 
SP^EPSSQVGVIQNKSWEl^VDRLETLSrapF 
IcSpDQESSLQSFCNSENK^a.KENADFLSLR 

oSpgnscaqdpas 

SSLSYVANQEPGILQQKNAVQnSSALDTD 

SronSOTVUJDVQKTOAFVPVYSDSTlQEA 

SwpSiYTLPVLPSEiaDFNGSDASTQLNTHYAF 

IS^ssgSvensttoiqvishekenklesl 

SSr?dsdlcemnagmpkgnlneqotkhc 

pWllsiedeesqqsilsslenhsqqstqp™ 

hkygolvkvelebnaeddktenqipqrmtrnk 

Stmanqskqilasctllsekdsesssprgrirlt 

^SSqihhprkrkvsrvpqpvqvspsllqaks: 

tooslamwslkldeiqpysseranpyfeylhir 

SS^Ssvipq^qyydeyvttogsylld 

SSnTPPPSLSDPLKELFRQQEVVRMKL 
SsffiREKLIVSNEQEVLRVHYRAARTLANQT 

S?Slldaevynvpldsqsddsktsvrdrf 

SoFMSWLQDVDDKFDKLKTCLLMRQQHEA 

S^qS:ewqlklqeldpatyksisiyeiqef 
yvplvdvnd dfeltpi 

TTOWHDGARWPLGiPRAAArRREAAAU'i'v i 
S^S^^LSSAENDFVHRIQEELDRFLLQKQ 

lskvllfpplssrlrylihrtaenfdllssfsvge 

Gm^TVICHQDIRVPSSDGLSGPCRAPASCPSR 
?SlSlQGAAAVPRGARAGRWYRGRKPDQ 
S^Sx^RRQEEWGLTSTSVLKREAPAGRDPEE 

SvgWnsdqglpvlmtqgtedlkgpgqr 

^!gST.r>PVGPEPLGPESQSGKGDMVEMA'raF 
SEQ ID NO: 

Method 

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence 

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 










GPSSCSEDDYSELLQEITDNLTBCKEIQffiBCIHLDTS 
SFMEELPGEKDIJ\HVVEIYDFEPAIJCTEDLLATF 
SEFQEKGFRlQWVDDTHALGIFPCRASAAEAL'm 
EFSVLKIRPLTQGTKQSKIJKALQRPKIXRLVKER 
PQTNAWARRLVARALGLQHKKKERPAVRGPLP 
P 


3577 


A 


102 


1998 


DTRTPGSLEMGPLQFRDVAIEFSLEEWHCLDTAQ 

RM.YRNVMLENYSNLVFLGIWSKPDLIAHLEQG 

KKPLTMKRHEMVANPSGPVICSHFAQDLWPEQN 

nCDSFQKVBLRRYEKRGHGNLQLIKRCESVDECK 

VHTGGYNGLNQCSTTTQSKVFQCDKYGKVFHK 

FSNS>mHNIRHTEKKPFKCIECGKAFNQFSTLITH 

BLKIHTGEKPYICEECGKAFKYSSALNTHKRIHTG 

EKPYKCDKCDKAFIASSTLSKHEIIHTGKKPYKCE 

ECGKAFNQSSTLTKHKKIHTGEKPYKCEECGKAF 

NQSSTLTKHKKIHTGEKPYVCEECGKAFKYSRIL 

TTHKRIHTGEKPYKCNKCGKAFIASSTLSRHEFIH 

MGKKHYKCEECGKAFIWSSVLTEUIKRVHTGEKP 

Y JvCcnCOJvAr Jv Y 1 JLooUKKoJtl 1 unKJr Y JvUisiiC 

GKAFVASSTLSKHEIIHTGKKPYKCEECGKAFNQ 

SSSLTKJiKKJHTGEKPYKCEECGKAFNQSSSL'IK 

HKJKIHTGEKPYKCEECGKAFNQSSTLIKHBCKIHT 

REKPYKCEECGKAFHLSTHLTTHKILHTGEKPYR 

CRECGKAFNHSATLSSHKKfflSGEKPYECDKCG 

KAFISPSSLSRHEIEHTGEKP 


3578 


A 


1725 


445 


RPRRRGIHHFSeVLGSFRVSAMFPRVSTFLPLRP 

LSRHPLSSGSPETSAAAIMLLTVRHGTVRYRSSA 

LLARTKNNIQRYFGTNSVICSKKDKQSVRTEETS 

KETSESQDSEKENTKKDLLGIIKGMKVELSTVNV 

RTTKPPKRRPLKSLEATLGRLRRATEYAPKKRIEP 

LSPELVAAASAVADSLPFDKQTTKSELLSQLQQH 

RSRPELRIQFDEGYDNYPGQEKTDDLKKRKNIFT 
GKRLNIFDMMAVTKEAPETDTSPSLWDVEFAKQ 
LATVNEQPLQNGFEELIQWTKEGKLWEFPINNEA 
GFDDDGSEFHEHIFLEKHLESFPKQGPIRHFMELV 
TCGLSKNPYLSVKQKVEHmWFRNYFNEKKDILK 
ESNIQFKLRPWKFLFRNN 


3579 


A 


1725 


445 


RPRRRGTHHFSCVLGSFRVSAMFPRVSTFLPLRP 

LSRHPLSSGSPETSAAAIMLLTVRHGTVRYRSSA 

LLARTK^nS^IQRYFGTNSVICSKKDKQSVR^EETS 

ICETSESQDSEKENTKKDLLGIIKGMKVELSTVNV 

RTTKPPKRRPLKSLEATLGRLRRATEYAPKKRIEP 

LSPELVAAASAVADSLPFDKQTTKSELLSQLQQH 

PPPCT? A nP n A W Pli^TQP^JXFTTCnA/IW AUQATAPV 
llJ2(XioXS^V^i\J-//Vl\JNJ^ISJLor OiN UoiJiVLIv Y /Vivo A 1 Alv V 

RSRPELRIQFDEGYDNYPGQEKTDDLKKRKNIFT 
GKRLNIFDMMAVTKEAPETDTSPSLWDVEFAKQ 
LAT\rNE0PL0NGFEELI0WTECEGKLWEFPIN>3GEA 
GFDDDGSEFHEHBFLEKHLESFPKQGPIRHFMELV 
TCGLSKNPYLSVKQKVEfflEWFRNYFNEKKDILK 
ESNIQFKLRPWKFLFRNN 


3580 


A 


3673 


1619 


LYCVAPYSRHLLGRMSHLPMKLLRKKIEKRNLK 
LRQRJO.KFQGASNLTLSETQNGDVSEETMGSRK 
VKKSKQKPMNVGLSETQNGGMSQEAVGNIKVT 
SEQ ID NO: 

Method 

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence 

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


CSPOKSTVLl'NGKAAMQSSNStSKKJiJUUjJJik 
ymiDAEPDTKKAKTENKGKSEEESAETTKET^ 
™CPDNDEDESEVPSLPLGLTGAFEDTSFASLC 
[JLVNENTLKAIKEMGFTNMTEIQHKSIRPLLEGR 

?g^;^Slamqttgvi^mi™^ 

MGGSNRSAEAQKLGNGINIIVATTCR^HM^ 
TPQFMYKNLQCLVIDEADRILDVGFEEELKQIIKL 

StrSSfsatqtricvedlarislkkeplyvg 

^roSlSrVDGlEQGYVVCPSEKRPLLLmL 

^Sklmvffsscmsvkyhyellotid^^^^^ 

AfflGKQKQNKRTTnTQFCNADSGmCTDVAA 
RGLDIPE^SwIVQYDPPDDPK£YmVG™^^ 
NGRGHALULRPEELGFLRYLKQSKVPLSEFOTS 
WSMSDIQSQLEmEKNYFLHKSAQEAYKS-CT^ 


3581 


A 


23 


453 


LCRCICIKNITTHCLWDKVLSgF 1 ^'l^ii^^^^i'^r 
HHPHSLRNSCLIRMDLLYWQFTTYTnFCFSffl^G 
W^TLSAQHISHRPCLLSYSLLFWKVHHLFLEGFPC 

SShqfpqhpvhvsvvhlpivykgsmt 


3582 


A 


3 


950 


TOGCGNKMAOKklvl VLi.SLAVYAEDSJiPESDGEA 

GffiAVGSAAEEKGGLVSDAYGEDDFSRLGGDH) 

GYEEEEDENSRQSEDDDSETEKPEADDPKDNTC 

AEKKDPQELVASFSERVRNMSPDEIKIPPEPPtmC 

SQDKIQKLYERKIKEGMDMNYnQRKK]^ 

SrYEmQFCAIDELGTNYPKDMFDPHGWSEDS 

^gSAQKIEMDKLEKAKKERTlOEFVTCm 

KGTTmATSTITITASTAVADAQKRKSK^\^M 

PVTTlAQPmTITATLPAVVTVTrSASGSKTTVIS 

AY<^Tnn<:KAKO 


3583 


A 


3 


950 


- -TOGC«MckNVLSSLAV YAEDSEPESPa^A 
GIEAVGSAAEEKGGLVSDAYGEDDFSRLGGDH) 
GYEEEEDENSRQSEDDDSETEKPEADDPKDNTC 
XEKKDPQELVASFSERVRNMSPDEIKIPPEPPaRC 
SNfflS)KIQKLYERKIKEGMDMNYnQRKKEFRN 
PSrJSlQFCAroELGTOYPKDMFDPHGWSEDS 

^Skiemdklekakkerikiefvtg^ 

KGTTTOATSTnTTASTAVADAQKRKSKWDSAI 
AVrjTTVKKAKO 


3584 


A 


3 


1139 


- PGSTISSRADRLGAPVLAHPKh4AJiKQt!EQRGSPP 
LRABsS^AEVKLILYHWraSFSSQKVRLVlAE 
KAIXCEEHDVSLPLSEHNEPWFMRLNSTGEVPV 
LfflGENIICEATQIIDYLEQTFLDERlPRLMPDKES 
tSYPRVOHYBELLDSLpWYTHGCILHPELTV 

Saya^rsqignteselkklaeenpdlqe 

^TCimNEETPEEGQQPWLCGESFTl^VS 
LAmHRLKFLGFARWWGNGKRPNLEmERV 
uSKWvLGHVNNILlSAVLFrAFRVAKHRAP 
KvSriLVVGLLAGVGYFAFMLFRKRLGSMlLA 

LRPRPNYF . 
SEQ ID NO: 

Method 

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence 

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


3585 


A 


1 


1777 


RRHSPGSPAFAPSSRATAICPRAARAPATLLLALG 

AVLWPAAGAWELTILHTNDVHSRLEQTSEDSSK 

CVNASRCMGGVARLFTKVQQIRRAEPNVLLLDA 

GDQYQGTIWFTVYKGAEVAHFMNALRYDAMA 

LGNHEFDNGVEGLEEPLLKEAKFPILSANIKAKGP 

LASQISGLYLPYKVLPVGDEWGIVGYTSKETPF 

LSNPGT^^:.WEDEITALQPE\aDKLKTLlm^M 

GHSGFEMDKLIAQKVRGVDWVGGHSNTFLYT 

GNPPSKEVPAGKYPFJYTSDDGRKVPWQAYAF 

GKYLGYLKIEFDERGNVISSHGNPILLNSSPEDPS 

IKADINKWRIKLDNYSTQELGKTrm.DGSSQSC 

IVr IvDx^lN Xvl VJIN i-»l^l-//VlYU,rN IN IN JUISXl 1 J^dVJJT W IN n V o 

MCILNGGGIRSPIDERNNGTITWENLAAVLPFGG 
TFDLVQLKGSTLKKAFEHSVHRYGQSTGEFLQV 
GGIHWYDLSRKPGDRWKLDVLCTKCRVPSYD 
PLKMDEVYKVILPNFLANGGDGFQMIKDELLRH 
DSGDQDINVVSTYISKMKVIYPAVEGRIKFSTGS 

HCTTG^F^T TFT ST WAVTTA/T YO 


3586 


A 


1399 


881 


LSNKDVLSPQLKDENSKLRRKLNEVQSFSEAQTE 

MVRTLERKLEAKMIKEESDYHDLESVVQQVEQN 

LELMTTOUVKAENHVVKLKQEISLLQAQVSNFQ 

RENEALRCGQGASLTWKQNADVALQNLRVVM 

NSAQASffiQLVSGAETLNLVAEILKSIDRISEVKD 

FEEDS 


3587 


A 


88 


1639 


GCVGRGLPLPPRHPTPPSSSSSPFVLLAFLLLVRL 
DPAVSGKMAAPRPPPARLSGVMVPAPIQDLEAL 

viELACRDPSQVENLASSLQLITECFRCLRNACIEC 

SVNQNSIRNLDTIGVAVDLILLFRELRVEQESLLT 

AFRCGLQFLGNIASRNEDSQSIVWVHAFPELFLS 

CLNHPDKKIVAYSSMILFTSLNHERMKELEENLN 

lAIDVIDAYQKHPESEWPFLnTDLFLKSPELVQA 

MFPBCLNNQERVTLLDLMIAKITSDEPLTBCDDPVF 

T TlTfAFT TA^TFVnnnK'TVT 'k'T ASFFPPnnFFAT A 

TIRLLDVLCEMTVNTELLGYLQVFPGLLERVIDL 

LRVIHVAGKETTNIFSNCGCVRAEGDISNVANGF 

KSHLIRLIGNLCYKNKDNQDKVNELDGIPLILDN 

O^SDSNPFLTQWVIYAIRNLTEDNSQNQDLIAK 

MEEQGLADASLLKKVGFEVEKKGEKLILKSTRD 

TPKP 






3588 


A 


3 


1462 


DSPRNRFEILGRPTRTPTRPGPRPAMEDLDALLSD 

LETTTSHMPRSGAPKERPABPLTPPPSYGHQPQT 

GSGESSGASGDKDHLYSTVCKPRSPKPAAPAAPP 

FSSSSGVLGTGLCELDRLLQELNATQFNITDEIMS 

QFPSSKVASGEQKEDQSEDKKRPSLPSSPSPGLPK 

ASATSATLELDRLMASLSDFRVQNHLPASGPTQP 

PWSSTNEGSPSPPEPTGKGSLDTMLGLLQSDLSR 

RGVPTQAKGLCGSCNKPIAGQWTALGRAWHPE 

HFVCGGCSTALGGSSFFEKDGAPFCPECYFERFSP 

RCGFCNQPIRHKMVTALGTHWHPEHFCCVSCGE 

PFGDEGFHEREGRPYCRRDFLQLFAPRCQGCQGP 

ILDNYISALSALWHPDCFVCRECFAPFSGGSFFEH 

EGRPLCENHFHARRGSLCATCGLPVTGRCVSAL 

GRRFHPDHFTCTFCLRPLTKGSFQERAGKPYCQP 

CFLKLFG 
PCT/US01/04098 

SEQ ID NO: 

Method 

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence 

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


3589 


226 


1793 



SPPKKSRKCNLSFRLISAERWRFFLLILMEMFKJSJ' 



laTLFVQRBIENIATEREFDPEEFYYLLEAAEGHA 
KEGOGIKTDffRYnSQLGLNKDPLEEMAHLGNY 
^GTAETPETOESVSSSNASLKLRRKPRESDFETI 
KLISNGAYGAVYFVRHKESRQRFAMKKINKQNL 

Srnqiqqafverdiltfaenpfvvsmycsfetrr 
hS^yveggdcatlmknmgplpvdmarm 

YFAETVLALBYLHNYGIVHRDLKPDNLLVTSMG 
HIKLTOFGLSKVGLMSMTTNLYEGHDBKDAMFL 
DKQVCQTPEYIAPEVILRQGYGKPVDWWAMGII 

l™.vgcvpffgdtpeelfgqvisdeinwpekde 
apppdaqdlthllrqnplerlgtggayevkqhr 
JSldwnsllrqkaefipqleseddtsyfdtrse 
kyhhmeteeeddtndedfnveirqfsscshrfsk 

WSSroRITXJNSAEEKEDSVDKTCSTTLPSTEU^ 

wsseysemqqlstsnssdtesnrhklssgupkl 
aistegeqdeaascpgdpheepgkpalppeecaq 
eepevtitastissstlsvgsfsehldqingrsecv 
dstdnsskpssepashmarqrlestekkkisgkv 

TKSLSASALSUflPGDMFAVSPLGSPMSPHSLSSD 

psssrdsspsrdssaasasphqpivihssgknygft 



titpfentsiktgparknsyksrmvrrskkskkk 
eslerrrslfkklakqpspllhtsrsfsclnrsls 

ssspsssapnspagsghirpstlhglapklggqry 

SSSlPLSPIARTPSFITQPTSPQ^^^^^^ 
LGHSLGNSKIAQAFPSKMHSPPTIVRHr^J^AE 

pprspllkrvqseeklspsygsdkkhlcsrkibl 

EVTQEEVQREQSQKEAPLQSLDENVCDVP^^^ 

rpveqgclkrpvskkvgrqesvddldrdklkak 

VWraCADGFPEKQESHQKFHGPGSDLENFALFK 

leerekkvypkaversstfenkasmqeapplgsl 

LKDALHKQASVRASEGAMSIXiPWAEmQGGG 

dfreapapgtlqdglchsldrgisgkgegtekss 

o5^LRCEKLDSKLANIDYLRKKMSLEDKEDN 

£cpvlkpkmtagsheclpgnpvrptggqqepppa 

seSafvsstoaaqmsavsfvplkaltgrvdsgt 

ekpglvapespvrkspseyklegrsvsclepiegt 

ldiallsgpqasktclpspesaqspspsgdvrasv 

ppviJsssgkki©ttsarelspsslkmnksyllep 

wflppsrglqnspavslpdpefkrdrkgphptar 

rPGTVMESNPQQREGSSPKHQDHTTDPKLLTCLG 
qSsroLABiRCPLPPEASPSREKPGLRESSE^^ 

ppt^ersaaradtcrepsmelcfpetaktsdn 

SKm-mGRTOPDFYTQTQAMEKAWAPGGKTN 
Kra)GPGEARPPPRDNSSLHSAGIPCEKELGKVRR 

Svepkpeallarrslqppgibsekseklssfpslq 

KDGAKEPERKEQPLQRHPSSIPPPPLTAKDLSSPA 

SqSpshasgrh-gakpstaepssspqdpp^ 

VAAHSESSSHKPRPGPDPGPPKTKHPDRSLSSQK 

psvSatkgkepatqslggssregkghsksgpdvf 

PATPGSQNKASDGIGQGEGGPSVPLHTDRAPLDA 
j^rMyrsr^RPLEVLEKPVHLPRPGHPGPSEPADQ 
SEQ ID NO: 

Method 

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence 

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 










KLSAVGEKQTLSPKHPKPSTVKDCPTLCKQTDN 

RQTDKSPSQPAANTDRRAEGKKCTEALYAPAEG 

DKLEAGLSFVHSENRLKGAERPAAGVGKGFPEA 

RGKGPGPOKPPTEADKPNGMKRSPSATGOSSFRS 

TALPEKSLSCSSSFPETRAGVREASAASSDTSSAK 

AAGGMLELPAPSNRDHRKAQPAGEGRTHMTKS 

DSLPSFRVSTLPLESHHPDPNTMGGASHRDRALS 

VTATVGETKGKDPAPAQPPPARKQNVGRDVTKP 

SPAPNTDRPISLSNEKDFWRQRRGKESLRSSPHK 

KAL 


3590 


A 


3 


935 


RATTRPKNEVQDYVSVEYLSPHMGGTDPFKYSY 

PPLVDDDFQTPLCENGPITSEDETSSKEDIESDGK 

ETLETISNEEQTPLLKKINPTESTSKAEENEKVDS 

KVKAFKKPLSVFKGPLLHTSPAEELYFGSTESGEK 

KTLIVLT^mXNIVAFKVRTTAPEKYRVKPSNSS 

CDPGASVDIVVSPHGGLTVSAQDRFLIMAAEME 

OSSGTGPAELTOFWKEVPRl^VMEHRLRCHTVE 

SSKPNTLTLKDNAFNMSDKTSEDICLQLSRLLES 

NRKLEDQVQRCIWFQQLLLSLTMLLLAFVTSFFY 

LLYS 


3591 


A 


303 


2 


GGSWGPLCPVSPAMSLSDPGLGYHPTCWTLRWP 

PLCSLHALHVFHCLFSSRLGTPVSPRLAMDPNCS 

CEAGGSCACAGSCKCKKCKCTSCKKSCCSCCPL 


3592 


A 


1052 


1779 


GKTMMRKMLLAAALSVTAMTAHADYQCSVTP 
RnnvTvspnwovKGFMG'Krr vttpdgktvmymgk 

QYSLNAAQREQAKDYQAELRSTLPWIDEGAKSR 

VEKARIAtt)KIIVQEMGlBSSKMRSRLtKLDAQN^ 

EQMNRIIETRSDGLTFHYKAIDQVRAEGQQLVNQ 

AMGGILQDSINEMGAKAVLKSGGNPLQNVLGSL 

GGLQSSIQTEWKKQEKDFQQFGKDVCSRWTLE 

DSRKALVGNLK 


3593 


A 


3 


1837 


LSFEKVDIQTDNDLTKEMYEGKENVSFELQRDFS 

QETDFSEASLLEKQQEVHSAGNIKKEKSNTIDGT 

VKDETSPVEECFFSQSSNSYQCHHTGEQPSGCTG 

LGKSISFDTKLVKHEIINSEERPFKCEELVEPFRCD 

SQLIQHQENNTEEKPYQCSECGKAFSINEKLIWH 

QRLHSGEKPFKCVECGKSFSYSSHYITHQTIHSGE 

KPYQCKMCGKAFSVNGSLSRHQRIHTGEKPYQC 

KECGNGFSCSSAYITHQRVHTGEKPYECNDCGK 

AFNGNAKLIQHQRIHTGEICPYECNECGKGFRCSS 

QLRQHQSIHTGEKPYQCKECGKGFNNNTKLIQH 

QRIHTASLAEQLFKASGl^NWGCCLTISSPGPS 

VYGPKMNMRGAPNSRLAGGREKRTQDTDFGQC 

SFLPSHSPSCFEPWhrNTTDYDSSWYROKOVLSGV 

t^X J^X tJX Xwrx U wX X^X Vt 1^ V X 1^ X Xi/V^Ul W X XW^jnk.>^ V JL/0\J V 

WSSPLSILKLPRTLIRISIfflQEMDTPGEMLMTGR 

GSLGPTLTTEAPAAAQPGKQGPPGTGRCLQAPGT 

EPGEOTPEGARELSPLOESSSPGGVKAEEEORAG 

AEPGTRPSLARSDDNDHEVGALGLQQGKSPGAG 

NPEPEQDCAARAPVRAEAVRRMPPGAEAGSWL 

DD 


3594 


A 


39 


261 


RAAMMDTSRVQPIKLAIVIKVLGRTGSQGQCTQ 

VRVEFMDDTSRSIIRSVKGPVREGDVLTLLESERE 

ARRLR 


3595 


A 


973 


68 


GRVGTKHQMADDAGAAGGPGGPGGPGMGNRG 
GFRGGFGSGIRGRGRGRGRGRGRGRGARGGKAE 
SEQ ID NO: 

Method 

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence 

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence 



3596 



3597 



3598 



3599 



277 



Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 



3907 



SETTOFFLGASLKDEVLlOMPVQKQTTlAGQRra' 
SSvAIGDYNGHVGLGVKCSKEVATAIRGAIILA 
K^r^VRRGYWGNKlGKPHTVPCKVTGRCGSV 
S^ffSGTCIVSAPVPKKLLMMAGIDDCYTS 

aKa^gnfakattdaisktysyltpdlwke 

^??SYQEFIDHLVKraTOVSVQRTQAPAVA 



Dffl^YLYHVLTKNTTVTEQNRNLLVETIRSITEIL 

IWGDQNDSSVFDFFLEKNMFWFLm,RQKSGRY 

VCVQLLQTT.NILFEOTSHETSLYYIXSN>2f^ 

VHKTOFSDEEIMAYYISFLKTLSUCU^rar;^ 

Yl^TNDFALYTEAIKFFNHPESMVWAVRTITL 

™VSLDNQAMLHYIRDKTAVPYFSNLJ^^ 

SHVIELDDCVQTOEEHRNRGKLSDLVAEHLDHL 

55SVeflndvltdhllnrlflplyvysl 
^Sgg^kislfvslyllsqvfliihhaplw 

SmNGDLSEMYAKTEQDIQRSSAKPSIRCFI 

S>tcSlemnkhkgkrrvqkri>ny^gee 
SLkgitedaqedaekakgteggskgk^^ 
eeiemvimersklselaastsvqeqnttdeeksa 
Scsestowsrpfldmvyhaldspdddyhalf 

OJcXLYAMSHNKGMDPEKLERIQLPVPNAAm 

tynhplaerlirimnnaaqpdgkirlatlelscl 

SoQA^MSAGCIMKDVHLACLEGAREESVHLy 
iS?KSSFLDMFEDEYRSMTmPNQ^EYLM 

JSASILLPPTGmTGroFVKM.PCGnvnE^I 
RVFFMLRSLSLQLRGEPETQLPLTREEDLKTDDV 

LDLNNSDLIACTVITKDGGMVQRSLAVDIYQMS 

t?iPDVSRLGWGVVKFAGLLQDMQVTGV^DS 

iALNmHKPASSPHSKPFPILQATTIFSD^nAK 

OWAKORIQARKMKMQRIAALLDLPIQPTTEVLG 

?GL^^TOTaSTOQGRRGSSDPTVQRSVF 

c^.cnQT^Hr.nsfiGTSSSSTPSTAQSPAGIGHyTQ . 
GVRRIQHHW AQMHECNVHlYASLFCLJLtJrrG 



ia.CCLNSHR HFHCIKYSK 

Ea'RIXKATAMn.EHYLDSlENU'tbLC^lUs^gL 

SDQRTEDKKAElDILAAEYISTVKTLSroQR 

^SSlqmqnayskckeysddkvqlamqtce^ 

SS^DADLABPEADUCDKl^E^GGR 
GLKKGRGQKEKRGSRGRGRRTSEEDIPKKKKH 

KGG 



SSLGESJLFroVACGRGKKADSWCrrSSG 
wcSsDmiiKWVELRVYPEVKDSNQACLPP 

S^OTQALLDTELPGGDKADASLLDPRVGI 
SvGvS^GQHIASGDKMGTLRVHELQSLSEh^ 
K\SSllX;LEYSKPDTGUCIiASASRDRLIH 
™^;^cT ^nTf nPHSSSTrAVKFAASDGQVR 
SEQ ID NO: 

Method 

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence 

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 










MSCGADKSIYFRTAQKSGDGVQFTRTHHVVRK 

TTLYDMDVEPSWKYTAIGCQDRNmiFNISSGKQ 

KKLFKGSQGEDGTLIKVQTDPSGIYTATSCSDKNL 

SIFDFSSGECVATMDFGHSEIVTGMKFSNDCKHLIS 

VSGDSOFVWRLSSEMTISMRQRLAELRQRQRGG 

KQQGPSSPQRASGPNRHQAPSMLSPGPALSSDSD 

KEGEDEGTEEELPALPVLAKSTKKALASVPSPAL 

PRSLSHWEMSRAQESVGFLDPAPAANPGPRRRG 

RWVQPGVELSVRSMLDLRQLETLAPSLQDPSQD 

SLAnPSGPRKHGQEALBTSLTSQNEKPPRPQASQ 

PCSYPHURLLSQEEGVFAQDLEPAPIEDGIVYPEP 

SDMPTMDTSEFQVQAPARGTLGRVYPGSRSSEK 

HSPDSACSVDYSSSCLSSPEHPTEDSESTEPLSVD 

GISSDLEEPAEGDEEEEEEEGGMGPYGLQEGSPQ 

TPDQEQFLKQHFETLASGAAPGAPVQVPERSESR 

SISSRFLLQVQTRPLREPSPSSSSLALMSRPAQVPQ 

ASGEQPRGNGANPPGAPPEVEPSSGNPSPQQAAS 

VLLPRCRLNPDSSWAPKRVATASPFSGLQKAQS 

VHSLVPQERHEASLQAPSPGALLSREIEAQDGLG 

SLPPADGRPSRPHSYQNPTTSSMAKISRSISVGEN 

LGLVAEPQAHAPIRVSPLSKLALPSRAHLVLDIPK 

PT PDTlT>TLAAFSPVTICGRAPGFAEKPGFPVGLGK 

AHSTTERWACLGEGTTPKPRTECQAHPGPSSPCA 

QQLPVSSLFQGPENLQPPPPEKTPNPMECTKPGA 

ALSQDSEPAVSLEQCEQLVAELRGSVRQAVRLY 

HSVAGCKMPSAEQSRIAQLLRD^TFSSVRQELEAV 

-AGAVLSSPGSSP0AVGAEQTQALLEQYSELLLRA 

VERRMERKL 


3600 


A 


1688 


916 


IPGSTISCSMALCEAAGCGSALLWPRLLLFGDSIT 

OFSFOOGGWGAST ADRT VRKCDVLNRGPSGYN 

TRWAKHLPRLIRKGNSLDIPVAVTIFFGANDSAL 

KDENPKOHIPLEEYAANLKSMVOYLKSVDIPENR 

VILITPTPLCETAWEEQCIIQGCKLNRLNSVVGEY 

ANACLQVAQDCGTDVLDLWTLMQDSQDFSSYL 

SDGLHLSPKGNEFLFSHLWPLIEKKVSSLPLLLPY 

WRDVAEAKPELSLLGDGDH 


3601 


A 


44 


223 


VHFPLIPQLAKCFWTMNRAARNKSEKRYYSEFL 
QIAHLFNYGLSSFLREFIIFLIKLLQ 


3602 


A 


37 


1124 


VPKPASGKRRLEFRPQDSKACAATPHSPGRITSR 

TRGSQKVRSVPPRLPWAQASASTDWEGLRGVPG 

PALRRENFLEAAASGRSGRTPTGGVGFRDVGGP 

HFPIFPAAHFLWCNLHTPRRPACNAPWHSPVGEl 

SPPPRESOLRRDPEVHFESPAHPLGFRLLPGRGLP 

ANAVTVETAAMAAPRQIPSHIVRLKPSCSTDSSF 

TRTPVPTVSLASRELPVSSWQVTEPSSKNLWEQI 

CKEYEAEOPPFPEGYKVKOEPVrrVAPVEEMLFH 

GFSAEHYFPVSHFTMISRTPCPQDKSETINPKTCS 

PKEYLETFIFPVLLPGMASLLHQAKKEKCFEVVL 

QMTPSGGKACVWGHLPSSSHTI 


3603 


A 


286 


587 


NISNKAEVSSHPSVISHSMDSFGQPRPEDNQSVLR 

RMQKKYWKtKQVFIKATGKKEDEHLVASDAEL 

DAKLEVFHSVQETCTELLKIIEKYQLRLNGMKS 


3604 


A 


103 


2440 


QPRRRVFPAAGRGPGRKCSQWGRQASVSFEDVT 

VDFSKEEWQHLDPAQRRLYWDVTLENYSHLLS 

VGYQIPKSEAAFKLEQGEGPWMLEGEAPHQSCS 
SEQ ID NO: 

Method 

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence 

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 

rEAIGKMQWUlFGGIFFHCEK^lXiPIGEDSLCSI 

SSaticnlgkifgngnnfphspsst^ 

aKSNKVFPQKPQVDVHPSVYTGEKPYLCTQCGK 

5fSSiS}Kihtgqkpykcsecgkaffqrs 

DiSSTGEKPYECSECGKGFSQNSDLSfflQ 

S^SaWcNGCGKAFIWKSRLKIH 
PKPYVCPECGKAFIQKSHFIAHHRIHTGEKPYEC^ 

toS^^i^treoyecgdcgktftwksrl 
htgkkpyactecqkaftdrsnlikhqkmhsgbk. 

■I^^GKGGKGimGGAKRHRKVLM) 








: 


3605 


A 


3 


322 


TSSSWGGVmSGLIYEETRGVLKVFL©! 

vS^^^Shakrktvtamdvvyalkrqgrt 


3606 


A 


1^ - * 


1749 


- WVTAEAKLMUFTQGCVTFbU V Al > I-SQEEWUJL 

SS^SS™eotalitalvcwhgmede 
e^^q^sVegvpqvrtpeasps 

PFLTOE.HLTDLPGQELYLTGACAVFHQIX2KHHS 

aSesdmdkasfvqcclfhesgmpftssevg 
^^SSlgSqaianyekpnkiskceeafhvgi 

SwSQ^SSHKHTFFHPRVCTGKRLYESS 

macSecslvqlqrvhpgerpyecsecgks 

fSSSvECGECGKSFSQSSNLIEHCRI 

JSS^ecdeSafgskstlvehqrthtgek 

--T^^^DPDEQYDFLFKLVLVGDASVGkl 


3607 


A 


92 


331 


^^Ql^SpSERQGSlIGVDFTmTmQGKR 

-■^^SyIOTasngrmMvtm- 


3608 


A 


545 


379 




3609 


A 


118 

1 


873 


SSGCLVSLSYHLDTKVRCPMCWQVn^S 
^nAOGTRERLAOAECVLEQF 
SEQ ID NO: 

Method 

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence 

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 










GNEDHHEFIWKFHSMASR 


3610 


A 


2 


987 


DPRVRPPLLQPPPPLLPRLVILKMAPLDLDKYVEI 

ARLCKYLPENDLKRLCDYVCDLLLEESNVQPVS 

IPVTVCGDIHGQFYDLCELFRTGGQVPDTNYIFM 

GDFVDRGYYSLETFTYLLALKAKWPDRITLLRG 

NHESRQITQVYGFYDECQTKYGNANAWRYCTK 

VFDMLTVAALIDEQILCVHGGLSPDIKTLDQIRTl 

ERNfQEIPHKGAFCDLVWSDPEDVDTWAISPRGA 

GWLFGAKVTNEFVHINNLKLICRAHQLVHEGYK 

FMFDEKLVTVWSAP>ryCYRCGNIASIMVFKDVN 

TREPKLFRAWDSERVTPPRTTTPYFL 


3611 


A 


245 


869 


AEmTAELREAMALAPWGPVKVKKEEEEEENF 

PGQASSQQVHSENIKVWAPVQGLQTGLDGSEEE 

EKGQNISWDMAWLKATQEAPAASTLGSYSLPG 

TLAKSEILETHGTMNFLGAETKNLQLLVPKTEIC 

EEAEKPLnSERIQKADPQGPELGEACEKGNMLK 

RQRIKREKKDFRQVIVNDCHLPESFKEEENQKCK 

KSGGKYSLNSGAVKNPKTQLGQKPFTCSVCGKG 

FSQSANLVVHQRIHTGEKPFECHECGKAFIQSAN 

LWHQRIHTGQKPYVCSKCGKAFTQSSNLTVHQ 

KIHSI^EKTFKCNECEKAFSYSSQLARHQKVHITE 

KCYECNECGKTFTRSSNLIVHQRIHTGEKPFACN 

DCGKAFTQSANLIVHQRSHTGEKPYECKECGKA 

FSCFSHLIVHQRIHTAEKPYDCSECGKAFSQLSCL 

IVHQRIHSGDLPYVCNECGKAFTCSSYLLIHQRIH 

NGEKPYTCNECGKAFRQRSSLTVHQRTHTGEKP 

YECEKCGAAFISNSHLMRHHRTHLVE / - / • 


3612 


A 


318 


2245 


SPMAEAALVNTPQIPMVTEEFVKPSQGHVTFEDI 

AVYFSQEEWGLLDEAQRCLYHDVMLENFSLMA 

SVGCLHGIEAEEAPSEQTLSAQGVSQARTPKLGP 

SIPNAHSCEMCILVMKDILYLSEHQGTLPWQKPY 

TSVASGKWFSFGSNLQQHQNQDSGEKHmXEESS 

ALLLNSCKIPLSDNLFPCKDVEKDFPTILGLLQHQ 

TTHSRQEYAHRSRETFQQRRYKCEQVFNEKVHV 

TEHQRVHTGEKAYKRREYGKSLNSKYLFVEHQR 

THNAEKPYVCNICGKSFLHKQTLVGHQQRIHTRE 

RSYVCIECGKSLSSKYSLVEHQRTHNGEKPYVCN 

VCGKSFRHKQTFVGHQQRIHTGERPYVCMECGK 

SFIHSYDRIRHQRVHTGEGAYQCSECGKSFIYKQ 

SLLDHHRIHTGERPYECKECGKAFIHKKRLLEHQ 

RIHTGEKPYVCnCGKSFIRSSDYMRHQRIHTGER 

AYECSDCGKAFISKQTLLKHHKIHTRERPYECSE 

CGKGFYLEVKLLQHQRIHTREQLCECNECGKVF 

SHQKRLLEHQKVHTGEKPCECSECGKCFRHRTS 

LIQHQKVHSGERPYNCTACEKAFIYKNKLVEHQ 

RIHTGEKPYECGKCGKAFNKRYSLVRHQKVHIT 

EEP 


3613 


A 


817 


3345 


NQSHPDSETVTVEGGRRKMKSNQERSNECLPPK 

KREIPATSRSSEEKAPTLPSDNHRVEGTAWLPGN 

PGGRGHGGGRHGPAGTSVELGLQQGIGLHKALS 

TGLDYSPPSAPRSVPVATTLPAAYATPQPGTPVSP 

VQYAHLPHTFQFIGSSQYSGTYASFPSQLIPPTAN 

PVTSAVASAAGATTPSQRSQLEAYSTLLANMGS 

LSQTPGHKAEQQQQQQQQQQQQQQQQQQQQQ 

QQQHQQQQQQQQQQQQQQHLSRAPGLITPGSPP 
SEQ ID NO: 

Method 

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence 

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 

•AOONOYVHlSSSPQNTUK'lASPPAlFVHUii'nv^ 
^^OTI^LTLGPPSQVVMQYADSGSHFVPREA■IK 
^SRLQQMQAKEVLNGEMEKSRRYGAPSSA 

^SSaggksvphpyesrhvwhpspsdyssr 

DreG^SVMVLPNSNTPAADLEVQQAlHREAS 

S^QAQra.PWQSVASPAAAPPTLPPYFl^^ 

STVmUEDSHSPGVAVIQFAVGEHRAQVSVEVLV 
I^^S^QGWSSCCPERTSQLFDLrcSKLSVGD 
VCISLTLKI>JLKNGSVKKGQPVDPASVLLKHSKA 

dglaSSryaeqenginqgsaqmlsengei^ 

?Si^GSFLTWKPAATmWS^ESR 
KT.KKSEDEPPLTLPKPSLIPQEVKICIEGRSNVGK 










3614 


A 


3 


114 


Fre^i^CCEPRGSWAWGCWRLQPlifKFi^Q 
T FCt 


3615 


A 


3 


1603 


TEVDALRLRLEEKETMLNKKTKQIQDMAE^GT 

SgehSkdmldvkerkvnvlqkkieni^^^^ 

mSsSLKERVKSLQADTTNTDTALTTL^ 

SSSrtorlkeqrdrderekqeeidnykkdl 
SSvsllqgdlsekeaslldlkehasslass 
olkkdsrlktleialeqkkeeclkmesqlkkah 

EiASSASPEMSDRIQHLEREnRYKDESSKAQ 
J^l^^n^VENEKNDKDKKIAEl^SLTSRQ 
^mANLKHKEQVEKKKSAQMLEEARJJ 
EDNLNDSSQQLQDSLRKKDDRIEELEEALRESVQ 

SSmvlaqeesartoaekqveellmamekv 

K5SKAKisSTQQSLAEKETlJ.T^^ 

kSleevlemkqeallaaisekdaniall^^^^^ 

KKTQEBVAALKREKDIU.VQQIXQQT^imflaM 
rtP>l^BnnWPKSSHSNOTNHKPSPDQDEEEGIWA 


3616 


A 


244 


1420 


- -MRWmoSIWfLAWAEATGAY VPGRDODL 
??JJWKSALNKKEGLRLAEDRSKDPiroWm 
VEFVNSGVGDFSQPDTSPDTNGGGSTSDTQEDtt, 
ScNMVI^LPDPGPPSLAVAPEPCPg'LRSP^ 
LDNPTPFPNLGPSENPLKRLLVPGEEWEFEVTAF 

S^RQWQQTISCPEGL1U.VGSEJGDR^^ 

VTLPDPGMSLTDRGVMSYVRHVLSCIX3CMLAL 

mAGQWLWAQIU.GHCHTYWAVSEELLPNSGH 

^OTWKDKEGGVFDLGPFIVGSLGPPDLITTTE 

SSSSIWFCVGESWPQDQPW^WK 

WPTCLRALVEMARVGGASSLENTVDLHISNSHP 

LSLTSDQYKAYLQDLVEGMDFQGPGES 


3617 


A 


852 


304 


— rTu^tTMMARVLKAAAANAVOLI'SRLQAPIFI V 

Stsotl^vtgsvwnlgrlnhvaiavpdle 

vaaafvTO^ILGAOVSEAVPLPEHGVSVVFV^ 
Sl^LHPLGRDSPlAGFLQKNKAGGMHHICffi 

^SIvmdlkkkkirslseevkigahgkpvif 

1 flrpKTV-nnvi.VELEOA 


3618 


A 


3 


5992 


- -DNroETYGVNVQmDEEEGDEDVYGKVKbfaAS 
°;^^«r.«AV^^rT^.SANMyVDEILVWCASEL 
SEQ ID NO: 

Method 

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence 

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 










NIPEFFPLESPHKKVGYGLSSRTWLQGGGKVIEA 
GRDLLVASGELMSSKKKDLHPRDIDAFWLQRQL 
SRFYDDAIVSQKKADEVLEILKTASDDRECENQL 
VLIXGFNTFDFIKVLRQHRMMILYCTLLASAQSE 
AEKERIMGKMEADPELSKFLYQLHETEKEDLIRE 
ERSRRERVRQSRMDTDLETMDLDQGGEALAPRQ 
VLDLEDLVFTQGSHFMANKRCQLPDGSFRRQRK 
GYEEVHVPALKPKPFGSEEQLLPVEKLPKYAQA 
GFEGFKTLNRIQSKLYRAALETDENLLLCAPTGA 
GKTNVALMCMLRmGBCHINMIXlTINVDDFKIIYI 
APMRSLVQEMVGSFGKRLATYGITVAELTGDHQ 
LCKEEISATQIIVCTPEKWDIITRKGGERTYTQLV 
RLIBLDEIHLLHDDRGPVLEALVARAIRNIEMTQE 
DVRLIGLSATLPNYEDVATFLRVDPAKGLFYFDN 
SFRPWLEQTYVGITEKKAIKRFQIMNEIVYEKIM 
EHAGKNQVLVFVHSRKETGKTARAIRDMCLEKD 
TLGLFLREGSASTEVLRTEAEQCKNLELKDLLPY 
GFAIHHAGMTRVDRTLVEDLFGDKHIQVLVSTA 
TLAWGVNLPAHTVIIKGTQVYSPEKGRWTELGA 
LDILQMLGRAGRPQYDTKGEGILITSHGELQYYL 
SLLNQQLPIESQMVSKLPDMLNAEIVLGNVQNA 
KDAVNWLGYAYLYIRMLRSPTLYGISHDDLKGD 
PLLDQRRLDLVHTAALMLDKNNLVKYDKKTGN 
FQVTELGRIASHYYITNDTVQTYNQLLKPTLSEEE 
LFRVFSLSSEFKNITVREEEKLELQKLLERVPIPVK 
ESBBEPSAKINVLLQAFISQLKLEGFALMADMVY 
'VTQSAGRLMRAite 

I^KRMWQSMCPIJRQFRKLPEEVVKKffiK^ 

FERLYDLlfflNEIGELIRMPKMGKTIHKYVHLFPK 

LELSVHLQPITRSTLKVELTITPDFQWDEKVHGSS 

EAFWILVEDVDSEVDLHHEYFLLKAKYAQDEHLI 

TFFVPVFEPLPPQYFIRVVSDRWLSCETQLPVSFR 

HLBLPEKYPPPTELLDLQPLPVSALRNSAFESLYQ 

DKFPFFNPIQTQVFNTVYNSDDNVFVGAPTGSGK 

nCAEFAILRMLLQNSEGRCVYITPMRLWQEQVY 

MBWYEKFQDRLNKKVVLLTGETSTDLKLLGKG 

NmSTPEKWDILSRRWKQRKNVQNINLFVVDEV 

HLIGGENGPVLEVICSRMRYISSQBBRPIRIVALSSS 

LSNAKDVAHWLGCSATSTFNFHPNVRPVPLELHI 

QGFMSHTQTRLLSMAKPVFHAITKHSPKKPVIVF 

VPSRKQTRLTAIDILTTCAADIQRQRFLHCTEKDL 

IPYLEKLSDSTLKETLLNGVGYLHEGLSPMERRL 

VEQLFSSGAIQVWASRSLCWGMNVAAHLVnM 

DTLYYNGKIHAYVDYPIYDVLQMVGHANRPLQ 

DDEGRCVMCQGSKKDFFKKFLYEPLPVESHLD 

HCMHDHFNAEIVTKTIENKQDAVDYLTWTFLYR 

RMTQNPNYYNLQGISHRHLSDHLSELVEQTLSDL 

EQSKCISffiDEMDVAPLNLGMIAAYYYINYTTIEL 

FSMSLNAKTKVRGLIEIISNAAEYENIPIRHHEDN 

LLRQLAQKVPHKLNNPKFNDPHVKTNLLLQAHL 

SRMQLSAELQSDTEEILSKAIRLIQACVDVLSSNG 

WLSPALAAMELAQMVTQAMWSEDSYLRRLPPF * 

PSGLFKRCTDKGVESVFDIMEMEDEERNALLQLT 

DSQIADVARFCNRYPNIELSYEWDKDSIRSGGP 

VVVLVQLEREEEVTGPVIAPLFPQKREEGWWW 
SEQ ID NO: 

Method 

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence 

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


JNIDETYGVNVQF^D^DEDVYGEVKbfcAa 
SMEGDEAVmCILSANMYVDElLVWCASEL 
SEmSPHKKVGYGLSSRTWLQGGGKVffiA 
SvASGE^MSSKKKDLHPI®ID^^^^QL 

sSyddaivsqkkadevleilktasddrecenql 

^EW-ALKPrnOSEEQU^VeCLPKY^^^^^ 

gfegfktlnriqsklyraaletdenlllcaptga 

SwMLMCJ^MIGKHINMDGTOmJDFKim 

SSeqS^Qitekkaikrfqim^ 
SSqvlvfvhsrketgktarairdmclekd 

SSSSSASTEVLRraAEQCKNLELKDLLPY 

FQVTCLGmSHYYi™)TVQTYNQlX 

L?RWSLSSEFK>mVREEEKLELQKLI^RVPffVK 

FSffiEPSAKINVLLQAFISQLKLEQFALMADMVY 

SS^CPLRQFRKLPEEWmE^P 

reRLYDLT«lNEIGELIRMPKMGKTIHKYVHLFPK 

SSvffl^QmRSTLKVELTTIPDFQWDEKVHGSS 

Sv^EDVDSEVILHHEYFLLKAKYAQDEfflJ 

SvFEPLPPQYFIRVVSDRWLSCErQLPVSre 

fflXPEKYPPPTELLDLQPLPVSALRNSAFESLYQ 

SSlQTQVFNTVYNSDDNVFVGAPTGSGK ' 

^SSSLQNSEGRCVYnTMRLWQEQVY 

IS^^^i^VVlXTGETSTDLKLLGKG 

SwEKMJILSRRWKQRKNVQNINLFVVDEV 

^£ oSGPVLEVICSRMRYISSQmRPIRWALSS^ 

SSvMimGCSATSTFN^ 

oShtotrllsmaovfhaitkhspkkpvivf 
SoSTWrrcAADigRgRFLHCTCBaJL 

^^SDSTLKETLLNGVGYLHEGLSPMERRL 
SSMQVWASRSLCWGMNVAAHLVnM 

ddSrcvmcqgskkdffkkflyeplpveshld 

SSSSTAHVTKTIENKQDAVDYLTWm^ 

SS5yWqgishrhlsdhlselveqtlsdl 
SSiSemdvaplnlgmiaavyyinyttiel 

^MsSSvRGLIEnSNAAEyENIPIRHHEDN | 


3619 


K. 1 




5992 5 

: 



SRMQLSAELQSDTEEILSKAIRLIQACVDVLSSNG 

WLSPALAAMELAQMVTQAMWSEDSYLRKLPPF 

PSGLFKRCTDKGVESVFDIMEMEDEERNALLQLT 

DSQIADVARFCNRYPNIELSYEWDKDSIRSGGP 

VVVLVQLEREEEVTGPVIAPLFPQKREEGWWW 

IGDAKSNSLISIKRLTLQQKAKVKLDFVAPATGG 

RHNTLYFMSDAYMGCDQEYKFSVDVKEAETDS 

DSD 



3620 



1205 



323 



VIKMALAARLLPQFLHSRSLPCGAVRLRTPAVAE 

VRLPSATLCYFCRCRLGLGAALFPRSARALAASA 

LPAQGSRWPVLSSPGLPAAFASFPACPQRSYSTE 

EKPQQHQKTKMIVLGFSNPmWRTRIKAFLIWA 

YFDKEFSITEFSEGAKQAFAHVSKLLSQCKFDLL 

EELVAKEVLHALKEKVTSLPDNHKNALAANIDEI 

\a^TGDISIYYDEKGRKFVNILMCFWYLTSANIP 

SETLRGASVFQVKLGNQNVETKQLLSASYEFQR 

EFTQGVKPDWTIARIEHSKLLE 



3621 



2995 



SSSRSRHSSISPVRLPLNSSLGAELSRKKKERAAA 

AAAAKMDGKESSYERSGSYSGRSPSPYGRRRSSS 

PFLSKRSLSRSPLPSRKSMKSRSRSPAYSRHSSSH 

SKKKRSSSRSRHSSISPVRLPLNSSLGAELSRKKK 

ERAAAAAAAKMDGKESSYERSGSYSGRSPSPYG 

RRRSSSPFLSKRSLSRSPLPSRKSMKSRSRSPAYS 

RHSSSHSKKKRSSSRSRHSSISPVRLPLNSSLGAEL 

SRKKKERAAAAAAAKMDGKESKGSPVFLPRKE 

NSSVEAKDSGLESKKLPRSVKLEKSAPDTELVNV 



GTRDSKPIALKEEIVTPKETETSEKETPPPLPTIASP 

PPPLPTTTPPPQTPPLPPLPPIPALPQQPPLPPSQPA 

FSQVPASSTSTLPPSTHSKTSAVSSQANSQPPVQV 

SVKTQVSVTAAIPHLKTSTLPPLPLPPLLPGDDDM 

DSPKETLPSKPVKKEKEQRTRHLLTDLPLPPELPG 

gdlsppdspepb:aitppqqpykkrpkiccpryger 

RQTESDWGKRCVDKFDNGHGEGTYGQVYKAKD 

KDTGELVALKKVRLDNEKEGFPITAIREEKJLRQL 

IHRSVVlSnVIKEIVTDKQDALDFKKDKGAFYLVFE 

YMDHDLMGLLESGLVHFSEDHIKSFMKQLMEGL 

EYCHKKNFLHRDIKCSNILLNNSGQIKLADFGLA 

RLYNSEESRPYTNKVITLWYRPPKLLLGEERYTP 

AIDVWSCGCILGELFTKKPIFQANLELAQLELISR 

LCGSPCPAVWPDVnCLPYFNTMKPKKQYRRRLR 

EEFSFIPSAALDLLDHMLTLDPSKRCTAEQTLQSD 

FLKDVELSKMAPPDLPHWQDCHELWSKKRRRQ 

RQSGVVVBEPPPSKTSRKETTSGTSTEPVKNSSPA 

PPQPAPGKVESGAGDAIGLADITQQLNQSELAVL 

LNLLQSQTDLSIPQMAQLLNIHSNPEMQQQLEAL 
NQSISALTEATSQQQDSETMAPEESLKEAPSAPVI 
LPSAEQTTLEASSTPADMQNILAVLLSQLMKTQE 
PAGSLEENNSDKNSGPQGPRRTPTMPQEEAAGRS 

NGGNAL 



3622 



3623 



16 



390 



1544 



TPERGSAYPETAAVRRPAGECPITMSDLEAKLST 

eillgdkiki>ediklrvigqdsseihfkvmttplk 

KLKKSYCQRQGVPVNSLRFLFEGQRIADMHTPEE 

LGMEEEDVIEVYQEQIGGHSTV 

PPPAPGPDGLHFEGCLHRLSMPHQRPRTCAMNPE 



3624 



27 



2152 



3625 



3626 



3627 



210 



1115 



921 



231 



644 




SffiOELLASPSPHHARRGPRGSLRGPPPPPTAHQ 

eS^^aasrsamvtsmasildggdywpe 

Sn>L™SNKCDSSPPGMGMSNITrTLmQP 

ShsS^Shhphphhhphhhhhhhhqrlsgn 

^GsJSSroGLPAMNW,ySPYKEMPGlW 

SpSatSglgglhnaqqslpnygppgitok 

S^DAffiJTAMLTRGEQHLSRGLGTPPAAM 

SnSghtqshgpvlapsrerppsss^s 

WATSGQl^INTlCEVAQRITAELKRYSIPQAff^ 

orvlSSgtlsdllrnpkpwsklksgrettrr 

S^^EFQRMSALRLAACKRKEQEPN^R 

Si^srlwtolqrrtlfaifkenkrpskemq 

SsQ^SrrVSNFFMNARRRSLEKWQDDLS 

TGGSSSTSSTCTKA 



S ARKAEAATSUT AARDGSVGRNLVPPPSASArK 



^EVESNEKDNRPEEEEQVIHEDDERPSEra^^ 

rkrsksedmdnvqskrrr™eeyeaefq™ 

AKGDINQKIXJKVIQWLLEEKLCALC^^^^^^ 

AELKTRVEKIECNKRHKTVLTELQAKIARLTCRF 

iSSoLKKRHEHPPNPPVSPGKTVNDWSM^ 

^Sgtwqmleskbkvsesappsfqtpvot 

5f^SLVTPPAVVSSQPKLQTPVTSGSLTATC^ 

SS?S™ttqvpsgnpqptislqplpviliw^^^ 

^SSQPQLLQSH^GTLNTTNQPSGNVEFISVQS^^ 

vsglSpvslpslpottkpnnvpsvpspsiqrnp 

mS/^GmAVQAWAHSIVQATRTSLPTVG 

Jsg^^?™rgpiqmkipis^^^^^^^ 
SISaSgtWqapavrqvnpqnsvtwv 

^m^VVWGLTLGSTGPQLTVHHRPP^^^^ 

Svhpaplpeapqpqrlppeagstsrpseatlev 
GHAYRVKMAIVLVMECPGGGSKLCHC 



SgScfyspasqsedvilikkydqmaifhc^ 
wSggalivgW.tqhlsllcekystvv 
rrr:nnr-Tv^^^"^^T.r.TMRRSKGRA£KS 

-SSVWSALSVSMACLSPSQLgU-QQDGKLVL^u 



i^LySSlTGVWIAVEDATLENGCLWFIPG 

sotsS^S^vgsapgtsflgsepardnsl 

SvQRGM.VLIHGEVVHKSKQNLSDRSRQA 
ySnlffiASGTTWSPENWLOPTAELPFPQLYT 



^T..cT«>T»m >Ht>hLl^mER0S10>gJ<A VLM^ML 










NPGIFYWIFLPSRSHSASHGSRQRQVSCQGTQDEI 
LKMRNTFAELKNSLEALSSRMDQAEERIGTQAG 
VQWRDHGSLQPQPPEFKQCFHLSLPSSWDYRAC 
LS 


3628 


A 


2 


810 


GCKHLLQNSWYDPRVREADRVGQRARRPRAAM 

DWLMGKSKAKPNGKKPAAEERKAYLEPEHTKA 

RITDFOFKELVVLPREIDLNEWLASNTTTFFHHTN 

LQYSTISEFCTGETCQTMAVCNTQYYWYDERGK 

KVKCTAPQYVDFVMSSVQKLVTDEDVFPTKYG 

REFPSSFESLVRKICRHLFHVLAHIYWAHFKETLA 

LELHGHLNTLYVHFILFAREFNLLDPKETAIMDD 

LTEVLCSGGRRGSTVGAVGMGPAAGAPGAQNH 

VKER 


3629 


A 


699 


1604 


CSHGSSAVSAWSPLFQASEVERQLSMQVHALRE 

DFREKNSSTNQHIIRLESLQAEIKMLSDRKRELEH 

RT ^ATT FFMDT T OGTVFFT ODRVT TT FROGHDK!!^ 

LQLHQSQLELQEVRLSCRQLQVKVEELTEERSLQ 

SSAATSTSLLSEIEQSMEAEELEQEREQLTLLSVE 

MTALKEERDRLRVTSEDKEPKEQLQKAIRDRDE 

AIAKKNAVELELAKCRMDMMSLNSQLLDAIQQ 

KLNLSOOLEAWODDMHRVIDROLMDTHLKFRS 

X^XwX^ X.4lJ\^\^X^Xi^'^ TT V^JLa'X^XVXXXXV V XXa'XXX^JLtlTXX^ X XXuXVJuiJCViJ 

QPAAALCRGHSAGRGDEPSIAEGKRLFSFFRKI 


3630 


A 


423 


1 


PAKVLTLDIYLSKTEGAQVDEPWITPRAEDCGD 
WDDMEKRSSGRRSGRRRGSQKSTDSPGADAELP 
ESAARDDAVFDDEVAPNAASDNASAEKKVKSPR 
AALDGGVASAASPESKPSPGTKGQLRGESDRSK 
^qpppassp ' * 


3631 


A 


2082 


674 


WSGFWQLPGVRGVGSAPGGDGAEFTSRRGSSRR 

PGAACPGCRGAGSERAPGGMGRRRAPELYRAPF 

PLYALQVDPSTGLLIAAGGGGAAKTGIKNGVHF 

LQLELINGRLSASLLHSHDTETRATMNLALAGDI 

LAAGQDAHCQLLRFQAHQQQGNKAEKAGSKEQ 

GPRQRKGAAPAEKKCGAETQHEGLELRVENLQA 

VQTDFSSDPLQKVVCFNHDNTLLATGGTDGYVR 

VWKVPSLEKVLEFKAHEGEIEDLALGPDGKLVT 

VGRDLKASVWOKDOLVTOLHWOENGPTFSSTP 

T VJXNX^JL^XVrxiJ T TT \^XVX^V^X,^ V X V^X^XX W V^X^ll VJX X X lJij X X 

YRYQACRFGQVPDQPAGLRLFTVQIPHKRLRQPP 

PCYLTAWDGSNFLPLRTKSCGHEWSCLDVSES 

GTFLGLGTVTGSVAIYIAFSLQCLYYVREAHGIV 

VTDVAFLPEKGRGPELLGSHETALFSVAVDSRCQ 

LHLLPSRRSVPVWLLLLLCVGLIIVTILLLQSAFPG 

FL 


3632 


A 


942 


40 


PWCQRVEVRSCGSSKRSCSRWSGSSWDGSRSLG 

RGLNHTSLNRSPPFTPDTMTHCCSPCCQPTCCRT 

TCCRTTCWKPTTVTTCSSTPCCOPSCCVPSCCOP 

CCHPTCCQNTCCRTTCCQPTCVASCCQPSCCSTP 

CCQPTCCGSSCCGQTSCGSSCCQPICGSSCCQPCC 

HPTCYQTICFRTTCCQPTCCQPTCCRNTSCQPTCC 

GSSCCQPCCHPTCCQTICRSTCCQPSCVTRCCSTP 

CCQPTCGGSSCCSQTCNESSYCLPCCRPTCCQTT 

CYRTTCCRPSCCCSPCCVSSCCQPSCC 


3633 


A 


605 


3004 


GPEGYRGRRARHPSLGSTTGHCGGGRGAEGTGT 
DPAAPAARLNVDGLLVYFPYDYIYPEQFSYMRE 
LKRTLDAKGHGVLEMPSGTGKTVSLLALIMAYQ 
RAYPLEVTKLIYCSRTVPEIEKVIEELRKLLNFYE 



3634 



3635 



3636 



3637 






159 



48 



384 



409 



282 



1248 



3638 



11 



3639 



630 



1200 




VDGKCHSLTASYVRAQYQHDTSLPHOlFYl^ro 
AHGREVPLPAGIYNLDDLKALGRRQGWCPYFLA 
RYSILHAlsfVVVYSYHYLLDPKIADLVSKEL^ 
AVWFDEAHNIDNVCrDSMSVNLTRRTLDRCQG 

LGFLRRLLEYVKWRLRVQHWQESPPAFLSG^ 
ORVCIORKPLRFCAERLRSLLHTLEITDLADFSPL 
TLLAMFATLVSTYAKGFniIEPFDDRTPTIANPIL 
HFSCMDASIJUKPVFEI^QSVIITSGTLSPL^^^ 
ILDlOTVTMATFn^ARVCLCPMIIGRGTOyA 
ISSKFETREDIAVIRNYGNLLLEMSAVVPDGIVAF 
FTSYOYMESTVASWYEQGILENIQKNiaLFIETQ 
DGAETSVALEKYQEACENGRGAILLSVARGKVS 
EGlDFVHHYGRAVlMFGVPYVYTQSRILKARLEy 
LRDOFOIRENDFLTFDAMRHAAQCVGRAIRGKT 
DYGU^ADKRFARGDKRGKLPRWIQEHL'IDA 
NLNLTVDEGVQVAKYFLRQMAQPFHREDQLGL 
«;t t ST .P.n LESEETLKRIEQIAQQL 



LmSSKTASl- NNIAQARRlVQQLRLtiASibKxKV 

sSSlmsyceeharsdplligiptsenpfkdkk 

TCBEL 



TELSQLEKAHPPADMGRKKSKJiKPPPKKlUWiGT 

letoftcpfcnhekscdvkmdrarntgvisctv 

CLEEFOTPrrciLGNLGFFQRVGRGLESGPCSSGP 
T PAT vnrxOSRP EEQVPFSDFCGVRRCRAGFQCQ 

DHLKSCYQDSHEDPTKMmU'UJ-ll^LLVW 
IQTGLSGQNDTSQTSSPSASSSMSGGIFLFFVANAl 

IHLFCFS 

ARAOSVVGSAAARGPPAGCR CER^ARLlr'tiSf AK 



RRRCDWVEDGAGRMEILMTVSKFASICTMGAN 
SScEIGPEQFPVNEHYFGLVNFGNTCYCNSy 
LOALYFCRPFREKGLAYKSQPRKKESLLTCLADL 
FHSIATOKKKVGVIPPKKFrrRLRKENELFDNYM 
QQDAIffiFLNYLLNTIADILQEERKQEKQNQRLPN 
GNIDNENNNSTPDPTWVHEIFQGTLTNETRCLTC 
EI^SKDEDFLDLSVDVEQNTSnHCLRGFSOTET 
LCSEYKYYCEECRSKQEAHKRMKVKKLPMILAL 
HLKRFKYMDQLHRYTKLSYRWFPLELRLFNTS 
GDATNPDRMYDLVAWVHCGSGPNRGHYIAIV 
KSHDFWLLFDDDIVEKIDAQAIEEFYGLTSDISKN 

SESGYILFYQSRD 

PAGIPVSTlSSDKRASTOLTRKMKPDbiPMt-UF NL 



LKEVDWSQOTATFSPMSPTHPGEGLVLl^LCTA 
DLNRGFFKVLGQLTETGWSPEQFMKSFEHMKK 

SSyv^dVgqwatatuiehkfii^^^^ 

RGRVEDVWSDECRGKQLGNLLLSTLTLLSKKL 

NCYKITLECLPQMVGFYKKFGYTVSEENYMCRR 



FLK 



PRVRLLRPSKSRSCRGLLSiKAPUPSPtRSLHSSPL 



LPHAMKSPFYRCQNTTSVEKGNSAVMGGVLFST 

GL^LALGuJvRSGLGWCSRRPLRPIJgVFY 

MLVCGLTVTDLLGKCLLSPWLAAYAQNRSLRV 

T APAT.nNSLCOAFAFFMSFFGLSSTLQLLAMALE 










CWLSLGHPFFYRRHITLRLGALVAPWSAFSLAF 

CALPFMGFGKFVQYCPGTWCFIQMVHEEGSLSV 

LGYSVLYSSLMALLVLATVLCNLGAMRNLYAM 

HRRLQRHPRSCTRDCAEPRADGREASPQPLEELD 

HLLLLALMTVLFTMCSLPVmiAYYGAFKDVKE 

K2^TSEEAEDLRALRFLSVISrVDPWii'WRSPVFR 

IFFHKIFIRPLRYRSRCSNSTNMESSL 


3640 


A 


930 


182 


PLPPPTLAMFLTRSEYDRGVNTFSPEGRLFQVEY 

AffiAIKLGSTAIGIQTSEGVCLAVEKRITSPLMEPS 

SIEKIVEIDAHIGCAMSGLIADAKTLIDKARVETQ 

NHWFTYNETMTVESVTQAVSNLALQFGEEDADP 

GAMSRPFGVALLFGGVDEKGPQLFHMDPSGTFV 

QCDARAIGSASEGAQSSLQEVYHKSMTLKEAIKS 

SLIILKQVMEEKLNATNIELATVQPGQNFHMFTK 

EELEEVIKDI 


3641 


A 


2 



1254 


PTGQGGRRAEARSCLLSKAMLGRSGYRALPLGD 

FDRFQQSSFGFLGSQKGCLSPERGGVGTGADVPQ 

SWPSCLCHGLISFLGFLLLLVTFPISGWFALKIVPT 

YERMIVFRLGRIRTPQGPGMVLLLPFIDSFQRVDL 

RTRAFNVPPCKLASKDGAVLSVGADVQFRIWDP 

VLSVMTVlCDLNTATRMTAQNANriXALLKRPL^ 

EIQMEKLKISDQLLLEINDVTRAWGLEVDRVELA 

VEAVLQPPQDSPAGPNLDSTLQQLALHFLGGSM 

NSMAGGAPSPGPADTVEMVSEVEPPAPQVGARS 

SPKQPLAEGLLTALQPFLSEALVSQVGACYQFNV 

VLPSGTQSAYFLDLTTGRGRVGHGVPDGIPDVV 

n^maeadlrallcrelrplgaymsgrlkvkgd 

lAmamkleavlralk 


3642 


A 


1 


237 


RRGEDDMATEGDVELELETETSGPERPPEKPRKH 
DSGAADLERVTDYAEEKEIQSSNLETAMSVIGDR 
RSREQKAKQER 


3643 


A 


94 


541 


RKERRRRRRRMEAWFVFSLLDCCALIFLSVYFII 

TLSDLECDYINARSCCSKLNKWVIPELIGHTIVTV 

LLLMSLHWFIFLLNLPVATWNIYRYIMVPSGNM 

GWDPTEIH^^GQLKSHMB□EAMIKLGFHLLCFF 

MYLYSMILALIND 


3644 


A 


95 


2808 


TSCRHFPITSEDPLNYLLILTVERIYAYQALPLGFL 

FCSRDPVPEYLNHCGVKYVLISDRASFCALHIFFS 

PFRNVFRPAAGGGIAPPPRLWFQPISLSDAEMEIPK 

LLPARGTLQGGGGGGIPAGGGRVHRGPDSPAGQ 

VPTORLLLPRGPQDGGPGRRREEASTASRGPGPS 

LFAPRPHQPSGGGGGGGDDFFLVLLDPVGGDVE 

TAGSGQAAGPVLREEAEEGPGLQGGESGANPAG 

PTALGPRCLSAVPTPAPISAPGPAAAFAGTVTIHN 

QDLLLRFENGVLTLAIPPPHAWEPGAAPAQQPG 

CLIAPQAGFPHAAHPGDCPELPPDLLLAEPAEPAP 

APAPEEEAEGPAAALGPRGPLGSGPGWLYLCPE 

ALCGQTFAKKHQLKMHLLTHSSSQGQRPFKCPL 
GGCGWTFTTSYKLKRHLQSHDKLRPFGCPAEGC 

GKSFTTVYNLKAHMKGHEQENSFKCEVCEESFP 

TQAKLGAHQRSHFEPERPYQCAFSGCKKTFITVS 

ALFSHNRAHFREQELFSCSFPGCSKQYDKACRLK 

IHLRSHTGERPFLCDFDGCGWNFTSMSKLLRHKR 

KHDDDRRFMCPVEGCGKSFTRAEHLKGHSITHL 

STKPFVCPVAGCCARFSARSSLYIHSKKHLQDVD 



3645 



3646 






2194 



1707 



8S 



1948 



3647 



46 



5007 






S^SleZwsltpsseltsqrqndlsdaeivslf 

?DVPDSTsXu.LDTALVNSGILTroVASVSSTLAG 

SpaS^gqavdppslmatsdppqsldtslf 
StSfqqssli^evssvsvgplgsld^a 

Mra^PEPOALTPSSKLTVDTOTLTPSSTLCENSV 

SSakaewsvhpnsdffgqegetqfgfpnaa 

J^Swotrkhkeqcnpetopvekkirsalptct 

^Sddddsiadflnsdeeedrvsl^^^^^ 
LGESATLRSLLLNPHLRQLMVNLDQGEDKAKLM 

RAYMQEPLF/EFADCCI^IV^SQN]^ 

^Saaaaaaaaaa^alaasg^™^ 



SpWDDSGDDDEATIPADKSELHHTLKNLSm, 
SroiDLIAKHGAALQRSLmDGLI^S^^^ 

dITs^Saslvpkgsskvkrrvrh^ 

YSLNLWSIMKNClGRELSRIPMPVNFNEPLSMLQ 
I^raDLESLDKAVHCTSSVEQMCLVAAFSV 
I^SSSSbFNPMLGElTELDRLDDMGLRS 

SeovsiSppsaahywskhgwslwqeitis^^ 

SSIS^WSDSQGKAHYVLSGSW 
ECSKVMHSSPSSPSSDGKQKTNTSf^SAm^ 
KYPLPENAENMYYFSELALTLNEHEEGVAPTDS 

KqblSgrwdeaniekqrleekqrlsr 
RRRLEACGPGSSCSSEE 



PTGDACVSTSCELASALSHLUASHLTENLi'KAAS 

Sgoqpmteldsssdlisspgkkgaahpdpskts 

?^TCQ™RPENPSQPASPRVTKCKARSPyW.^^ 
GsipGEKAAAPPDYSKTRSASEmPmrr^^^^^ 

YSRNFSSFHEDSTSLSGLGDSTEPSLSSMYGDAE 
DSSSDPESLTEAPRASARDGWSPPRSRVSLHKED 

^sSomiCSTRGCPNPPSSPAHLFrQAAICPAS 
^SKSmESVASPREKVACLPGSYTSGPD 
tSsSLLEMSSQEHEIHADISTSQNHBPSCAEET 
SSffiNSPLSKVAlWFHSPPnLSSPNMV 
5^I^?LDDETLNQYETS1NAAASI5SFS>^VP 

Skvlenlhisesqdlddllqkpkmiarrpim 

^S^iKHNQGTHLRSKTEKEQPL^^^ 
SSSvTVPHSPPQPKTNLENKDLSKK 

SaenSltngqkakcgpklkrlslkgkakvnse 
SSna^ggtohrkplispqtshktlskavs 

^^y^onpvTTAAPRSPOCVLESKPPLAT 










SGPLKPSVSDTSIRTFVSPLTSPKPVPEQGMWSRF 

HMAVLSEPDRGCPTTPKSPKCRAEGRAPRADSG 

PVSPAASRNGMSVAGNRQSEPRLASHVAADTAQ 

PRPTGEKGGNIMASDRLERTNQLKIVEISAEAVSE 

TVCGNKPAESDRRGGCLAQGNCQEKSEIRLYRQ 

VAESSTSHPSSLPSHASQAEQEMSRSFSMAKLAS 

SSSSLQTAIRKAEYSQGKSSLMSDSRGVPRNSIPG 

GPSGEDHLYFTPRPATRTYSMPAQFSSHFGREGH 

PPHSLGRSRDSQVPVTSSVVPEAKASRGGLPSLA 

NGQGIYSVKPLLDTSRNLPATDEGDnSVQETSCL 

VTDKIKVTRRHYCYEQNWPHESTSFFSVKQRIKS 

FENLANADRPVAKSGASPFLSVSSKPPIGRRSSGS 

IVSGSLGHPGDAAARLLRRSLSSCSENQSEAGTL 

LPQMAKSPSIMTLTISRQNPPETSSKGSDSELKKS 

LGPLGIPIFI'MTLASPVKKNKSSVRHTQPSPVSRS 

KLQELRALSMPDLDKLCSEDYSAGPSAVLFKTEL 

EITPRRSPGPPAGGVSCPEKGGNRACPGGSGPKT 

SAAETPSSASDTGEAAQDLPFRRSWSVNLDQLLV 

SAGDQQRLQSVLSSVGSKSTILTLIQEAKAQSENE 

FD VPFTVLNR K FG S G LGFS V A G GTD VEPKSTTVH 

RVFSQGAASQEGTMNRGDFLLSVNGASLAGLAH 

GNVLKVLHQAQLHKDALWIKKGMDQPRPSAR 

QEPPTANGKGLLSRKTIPLEPGIGRSVAVHDALC 

VEVLKTSAGLGLSLDGGKSSVTGDGPLVIKRVY 

KGGAAEQAGIffiAGDEILAINGKPLVGLMHFDA 

WNIMKSVPEGPVQLLIRKHRNSS 


3648 


A 


337 




KSRLSVTLMPVQLSEHPEWNESMHSLRISVGGLP 

VLASMTKAADPRFRPRWKVVLTFFVGAAILWLL 

CSHRPAPGRPPTHNAHNWRLGQAPANWYNDTY 

PLSPPQRTPAGIRYRIAVIADLDTESRAQEENTWF 

TYLKKGYLTFSDSGDKVAVEWDKDHGVLESHL 

AEKGRGMELSDLIVFNGKLYSVDDRTGVVYQIE 

GSKAVPWVTLSDGDGTVEKGFKAEWLAVKDER 

LYVGGLGKEWTTTTGDVVNENPEWVKVVGYK 

GSVDHENWVSNYNALRAAAGIQPPGYLIHESAC 

WSDTLQRWFFLPRRASQERYSEKDDERKGANLL 

LSASPDFGDIAVSHVGAVVPTHGFSSFKFIPNTDD 

QnVALKSEEDSGRVASYIMAFTLDGRFLLPETKI 

GSVKYEGIEFI 


3649 


A 


1 


775 


PTRPGSGSAGGARVGSGEFGVEMAALAPLPPLPA 

OFKSIOHHLRTAOEHDKRDPVVAYYCRLYAMO 

TGMKIDSKTPECRKFLSKLMDQLEALKKQLGDN 

EAITQEIVGCAHLENYALKMFLYADNEDRAGRF 

HKmilKSFYTASLLmVITWGELTDENV^^ 

ARWKATYIHNCLKNGETPQAGPVGIEEDNDIEEN 

EDAGAASLPTQPTQPSSSSTYDPSNMPSGNYTGI 

QIPPGAHAPANTPAEVPHSTGVAK 


3650 


A 


20 


963 


KMAATLGPLGSWQQWRRCLSARDGSRRLLLLL 

LLGSGQGPQQVGAGQTFEYLKREHSLSKPYQGE 

APRPCFLRDWELQVHFKmGQGKKNLHGDGLAI 

WYTKDRMQPGPVFGNMDKFVGLGVFVDTYPNE 

EKQQERVFPYISAMVNNGSLSYDHERDGRPTEL 

GGCTAI\^l^«.HYDTFLVIRYVKRHLTIMMDro 

HEWRDCIEVPGVRLPRGYYFGTSSITGDLSD>JHD 

VISLKLFELTVERTPEEEKLHRDVFU>SVDNMKL 



3651 






1218 






PEMTAPLPPLSGL ALFLIVFt--SLVl-SVFAIVlCillLV 

NKWOEQSRKRFY 

' RSWAYVKKCKNNMCPNRUL HDGPEt'CWLMHA 



3652 



AGTVSAVQARGLQPSQSRSRPRVPGLATALAYG 
PAHTPPLSRIGWAMQPPPPGPLGDCLRDWEDLQ 
QOTQNIQVSAAADAGSPPSRVSLAQGQGSGSPGC 
KPSLPAEAEGAAQELENQMKERQGLFFDNffiAYL 
PKKNGLYLSLVLGNVNVTLLSKQAKFAYKDEYE 
KFKLYLTIILILISFTCRFLLNSRVTDAAFNFLLW 
YYCILTOESILINNGSRIKGWWVFHHYVSTFLSG 
VMLTWPDGLMYQKFRNQFLSFSMYQSFVQFLQ 
Y^QSGSWliiLGERHmDLTVEGFQSW 
R\a.TFLLPFLFFGHFWQU^ALTlJW.AQDPQCX 
EWLMCGFPFLLLFLGl«nTTrLRWH^ 
RHQSKKD 



640 



164 



3653 



3654 



909 



3655 



909 



VTrSCnPFAFGLGVRASERLAElDMP ^ LLKYQPM 
MQTIGQKYCMDPAVIAGNTLSRKSPGDKILWMG 
DRTSMVQDPGSQAPTSWISESQVFQTTEVLTTRI 
moSrWTPDQYLRGGLCAySGGAGYVRSS 
QDLSCDFCNDVLARAKYLKRHGF 



TVRRDWOEVSDIHLAMANCKMTKSlKl<rAUiiH<^ 
YTGGEVVLPKDQEEWKRRTGLLLYENYGQSETG 
LICATYWGMKIKPGFMGKATPPYDVQFHMEASV 
ENCnVSMNTADPGSQGITHSLLLQVIDDKGSILPP 
>rrEGl«nGIRIKPVRPVSLFMCYEGDPEKTAKVEC 
GDFYNTGDRGK]V!DEEGYICFLQRSDDIINASGYR 
IGPAEVESALVEHPAVAESAVVGSPDPIRGEVVK 

afwItpqflshdkdqltkelqqhvksvtapyky 

pyYy^^r<iVJ PKTTTGKIERKELRKKETGQM 



IVRRDWQEVSDIHLAMANCKMTKSlRfPALtiriC 



2364 



YTGGEVVLPKDQEEWKRRTGLLLYENYGQSETG 
iSASwGMKra>GFMGKATPPYDVQFHMEASV 
ENCIIVSMNTADPGSQGITHSLLLQVIDDKGS]LPP 
NTEGNIGnUKPVRPVSLFMCYEGDPEKTAKVEC 
GDFYNTGDRGKMDEEGYICFLGRSDDIINASGYR 

igpaevesalvehpavaesawgspdpirge™ 

AHVLTPQFLSHDKDQLTKELQQHVKSVTAPYKY 
ppy^^w<;Tn pictttGKIERKELRKKETGQM 



SPGPSLPESAEHLDGSQEDKPKGSCAEFit-lUlG 



MVAHlNNSRLKAKGVGQHDNAQNFGNQSraEL 
lUVASkGELFEDPLFPAEPSSLGFKDLQPNSK^^ 
VONISWORPKDIINNPLFIMDGISPTDICQGILGDC 
mLAAIGSLTTCPKLLYRVVPRGQSFKKNYAGIF 
KrWQFGQWVNVVVDDRLPTKNDKLVFVHST 

SsefwsaiIekayaklsgsyealsggstmegl 

EDFTGGVAQSFQLQRPPQNLLRLLRKAVE^SL 

MGCSIEVTSDSELESMTDKMLVRGHAYSVTGLQ 
DYETYRGKMETLIRVRNPWGRIEWNGAWSDSAR 
eweevasdiqmqllhktedgefwmsyqdfl^ 
fS^tpd-ilsgdyksywhttfyegswrtg 

SSAGGCRNHPGTTWTNPQFKISLPEGDDPEDDAE 

GNVWCTCLVALMQKKWRHARQQGAQLQTIGF 

VLYAVPKEFQNIQDVHLKKEFFTK.YQDHGFSEIF 
TNSREVSSQLRLPPGEYIlIPSTFEPHRDADFLl*y 
^HSRSWELDEVNYAEQLQEEKVSEDDMDQ 










DFLHLFKWAGEGmGVYELQRLLNRMAIKFKS 

FKTKGFGLDACRCMINLMDBCDGSGKLGLLEFKI 

LWKKLKKWMDIFRECDQDHSGTLNSYEMRLVIE 

KAGIKLNNKVMQVLVARYADDDLIIDFDSFISCF 

LRLKTMFITFLTMDPKNTGfflCLSLEQVLGEGW 

EGICRIAPACPSTPPPPSSDVPGPASCPRLFPPWDL 

LPVSTVAADDHVGffiAL 


3656 


A 


3 


174 


PLCTHYLLPELPEKSSRTSPRSRPGNMLSGDPHLP 
QPLCHCLDHCPCCFSGKRLVA 


3657 


A 


1 


444 


DTRSTYHNAHSLPTYVKSPAPCQMTYIKSPAPCQ 

TQTCYVQGASPCQSYYVQAPASGSTSQYCVTDP 

CSAPCSTSYCCLAPRTFGVSPLRRWIQRPQNCNT 

GSSGCCENSGSSGCCGSGGCGCSCGCGSSGCCCL 

GIIPMKSRSPALL 


3658 



A 


92 


1537 


SEAPVQPQPYTMTSFYSTSSCPLGCTMAPGARNV 

FVSPIDVGCQPVAEANAASMCLLANVAHANRVR 

VGSTPLGRPSLCLPPTSHTACPLPGTCHIPGNIGIC 

GAYGK]Srn.NGHEKETMKFLNDRLANYLEKVRQ 

LEQENAELETTLLERSKCHESTVCPDYQSYFRTIE 

ELQQKILCSKAENARLIVQIDNAKLAADDFRIKL 

ESERSLHQLVEADKCGTQKLLDDATLAKADLEA 

QQESLKEEQLSLKSNHEQEVKILRSQLGEKFRIEL 

DIEPTDDLNRVLGEMRAQYEAMVETNHQDVEQ 

WFQAQSEGISLQAMSCSEELQCCQSEILELRCTV 

NALEVERQAQHTLKDCLQNSLCEAEDRYGTELA 

QMQSLISNLEEQLSEIRADLERQNQEYQVLLDVK 

ARLiBNEIAtYRNLTPLQSLFH^Q^^ 

HRWSLWPWSQHGEMILKARVnElRLRLV 

VPSPCPVFLQD 


3659 


A 


2 


402 


DLLQCLNQLYSASTEMSCQQSQQQCQPPPKCTP 

KCPPKCIPKCPPKCPPKCPPQYSAPCPPPVSSCCG 

SSSGGCCSSEGGGCCLSHHRPRQSLRRRPQSSSC 

CGSGSGQQSGGSSCCHSSGGSGCCHSSGGCC 


3660 


A 


26 


710 


CSAVEVKMAARTAFGAVCRRLWQGLGNFSVNT 

SKGNTAKKGGLLLSTOMKWVQFSNLHVDVPKD 

LTKPVVTISDEPDILYKRLSVLVKGHDKAVLDSY 

EYFAVLAAKELGISIKVHEPPRKIERFTLLQSVHI 

YKKHRVQYEMRTLYRCLELEHLTGSTADVYLEY 

IQRM.PEGVAMEVTKFCFFIFLDTIRTVTRTHQGA 

NLGNTIRRKRRKQVIKPQGGHFCLNLK 


3661 


A 


2 


370 


DVSVAASEPTVYRNPTKMSCQQNQQQCQPPPKC 
PPKYPPKCPSKCASSCPPPISSCCGSSSGGCCSSG 
GCGCCSSEGGGCCLSHHRHHRSHCHRPKSSNCY 
GSGSGQQSGGSGCCSGGGCC 


3662 


A 


205 


1277 


RKSLPHPNPQKMLKKPLSAVTWLCEFIVAFVSHP 

AWLQKLSKHKTPAQPQLKAANCCEEVKELKAQ 

VANLSSLLSELNKKQERDWVSWMQVMELESN 

SKRMESRLTDAESKYSEMNNQBDIMQLQAAQTV 

TQTSAGKETSPLRERGVPPHLQHCFYIPPDDFLGS 

PELEVFCDMETSGGGWTHQRRKSGLVSFYRDW 

KQYKQGFGSIRGDFWLGNEHIHRLSRQPTRLRVE 

MEDWEGNLRYAEYSHFVLGNELNSYRLFLGNY 

TGNVGNDALQYHNNTAFSTKDKDNDNCLDKCA 

QIJIKGGYWYNCCTDSNLNGVYYRLGEHNKHLD 

GITWYGWHGSTYSLKRVEMKIRPEDFKP 


3663 


A 


54 


1456 


SSAKETLAQMV NTVWNMBDLDLEYAKTUINC 
iTDLMFmMDPPALPPKPPKPTTVANNGMNNN 
VISLQDAEWYWGDISREEVKEKLRDTADGTFLV 
RDASTKMHGDYTLTLRKGGNNKLIKIFHRDGKY 
GFSDPLTFSSWVEIJMIYRNESLAQYNPKLDVKL 
LYPVSKYQQDQWKEDNIEAVGKKLHEYNTQFQ 
EKSREYDRlWYTOTSQEIQl^TAIEAFNEm 
IFEEOCOTQERYSKEYIEKFKKEGNEKEIQRIMHN 
YDKLKSRISEIIDSRRRLEEDLKKQAAEYREIDKR 
MNSIKPDLIQLRKraDQYLMWLTQKGVRQKKL 

NEWLGNEOTEIXJYSLVEDDEDLPHro^ 
GSSNRMCAENLLRGKRlXimVRESSKQGC^^^^^ 
SVWDGEVKHCVINKTATGYGFAEPYNLYSSLK 
PT VT KYOHTSLVOHNDSLNVTLAYPVYAQQRR 


3664 


A 


944 


406 


GATVEDOSCNFGSLRWVVSVPHISARSCl'JJl'Ll.5 
RTGRVPGGRGAGLPRHHSPRCCLQVFFNGANVR 
OVDVFILTGAFGILAAHVPTLQVLRPGLWVHA 
EDGTTSKYFVSSGSIAVNADSSVQLLAEEAVTLD 
MLDLGAAKANLEKAQAELVGTADEATRAEIQIR 


3665 


A 


98 


1388 


ASQLAFGGKLTSTPSRDFQGCOKGAVTCCSfimM 
RHOSGRCLSTGMAPNLKGRPRKKKPCPQRRDSF 
SGVKDSNNNS1X3KAVAKVKCEARSAU1XPK1W 
HNCKKVSNEEKPKVAIGEECRADEQAFLVALYK 
YMKERKTPIERIPYLGFKQINLWTMFQAAQKLG 
GYETITARRQWKHIYDELGGNPGSTSAATCTRR 
.ftYltapS^RFtoEDKPLPPIKPRKQE^^^^^ 
!te>)kfkVSQfKRIKHEIPKSKKEKENAPKPQDAA 
EVSSEQEKEQETLISQKSIPEPLPAADMKKKIEGY 
OEFSAKPLASRVDPEKDNETDQGSNSEKVAEEA 
GmCGPTPPLPSAPLAPEKDSALVPGASKQPLTSPS 
ALVDSKQESKLCCFTESPESEPQEASFPRLPHHTG 
TTOWQTRMRRKMTOCPPWQITLPTAP 


3666 


A 


113 


1492 


■ TXOEMCnCTIPVLWGCFLLWNLYVSSSQTlYi'Ui 
KAJRITORALDYGVQAGMKMffiQMLKEKKLPDL 
SGSESLEFLKVDYVNYNFSNIKISAFSFPNTSLAF 
VPGVGKALTOHQTANISTOWGFESPLFVLYl^F 
AEPMEKPIlJCNLNEMLCPIlASEVKALNAmSTLE 
VLTKJDNYTLLDYSLISSPEIIENYLDLNLKGVFY 
PLENLTDPPFSPVPFV1J»ERSNSMLYIGIAEYFFKS 
ASFAHFrAGVFNVTLSmiSNHFVQNSQa,GNV 
LSRIAEIYDoSQPFMVRIMATEPPIINLQPGNFTLDI 

PASIMMLTQPKNSTVETIVS>^FyAST^^^^^^ 
GQRLVCSLSLNRITU.ALPES>mSMEVLRFENlLSS 

S^GVIJ^AKLQQGFPLPOTHKFLFyNSDl^^^ 
LEGFLLISTDLKYETSSKQQPSFHVWEGLNLISRQ 

wrgksap 


3667 


A 


1 


181 


- -mOTLGSGRNGGGSMNAPPAl-bSFLLFEGEKll IN 
TirnTKVPNACLFTINKEDHTLGNnK 


3668 


A 


212 


431 


- -VaGEAVPFFPMMV SEPLKPSYLALVLW VFLLIO 
YCnKPEVIFKIEQQEEPWILEKGFPSQCHPAKYL 


3669 


A 


458 


1056 


FSGVCFAGIAGSMATLLHDAVMNFABV VJ^V^^y 
JJwSQHRSAISCIRTVWRTEGLGAFYRS^TTQLT 
SFOSIHFriY^^ n^^V^HRTYNPQSHnSGG | 










LAGALAAAATTPLDVCKTLLNTQENVALSLANIS 
GRLSGMANAFRTVYQLNGLAGYFKGIQARVIYQ 
MPSTAISWSVYEFFKYFLTKRQLENRAPY 


3670 


A 


145 


298 


RNPCPLTFLPSTLMVLLLSLTFFSALTFHSICQLRN 
TGVEVDIVFQRVSFL 


3671 


A 


3 


462 


ELKVAKKERTMSSLPVPYKLPVSLSVGSCVnKGT 

PIHSFINDPQLQVDFYTDMDEDSDIAFRFRVHFG 

NHVVM>mREFGIWMLEETTDYVPFEDGKQFELC 

lYVHYNEYEIKVNGHTHLRALSHRIPPSFVEDGC 

KCPRRYLPWTSVCVCN 


3672 


A 


1 


1028 


HYAKLGTRPRLKFMSSPSLSDLGKREPAAAADE 

RGTQQRRACANATWNSIHNGVIAVFQRKGLPDQ 

ELFSLNEGVRQLLKTELGSFFTEYLQNQLLTKGM 

VILRDKIRFYEGQKLLDSLAETWDFFFSDVLPML 

OAIFYPVOGKEPSVROLALLHFRNAITLSVKLED 

ALARAHARVPPAIVQMLLVLQGVHESRGVTEDY 

LRLETLVQKWSPYLGTYGLHSSEGPFTHSCILEK 

RLLRRSRSGDVLAKNPVVRSKSYNTPLLNPVQE 

HEAEGAAAGGTSIRRHSVSEMTSCPEPQGFSDPP 

GQGPTGlTRSSPAPHSGPCPSRLYPrrQPPEQGLD 

PTRS 


3673 


A 


2 


712 


RPPRVWYPELRELSAAAPRWSHRTAPGIMVFYF 
TSSSVNSSAYTIYMGKDKYENEDLIKHGWPEDI 
WFHVDKLSSAHVYLRLHKGENIEDIPKEVLMDC 
AHLVKANSIQGCKMNNVNVVYTPWSNLKKTAD 
MDVGQIGFHRQKDVKIVTVEKKVNEILNRLEKT 
KVERFPDLAAEKECRDREERNEKKAQIQEMKKR 
EKEEMKKKREMDELRSYSSLMKVENMSSNQDG 
NDSDEFM 


3674 


A 


2 


712 


RPPRVWYPELRELSAAAPRWSHRTAPGIMVFYF 

TSSSVNSSAYTIYMGKDKYENEDLIKHGWPEDI 

WFHVDKLSSAHVYLRLHKGENIEDIPKEVLMDC 

AHLVKANSIQGCKMNNVNVVYTPWSNLKKTAD 

MDVGQIGFHRQKDVKIVTVEKKVNEILNRLEKT 

KVERFPDLAAEKECRDREERNEKKAQIQEMKKR 

EKEEMKKKREMDELRSYSSLMKVENMSSNQDG 

NDSDEFM 


3675 


A 


921 


1321 


VTLAKMRVfflSSCLKVQEQMANCPKFVPWPTS 
QPIPSNIPNRSTFACPYCGARNLDQQELVKHCVE 
SHRSDPNRVVCPICSAMPWGDPSYKSANFLQHL 
LHRHKFSYDTFVDYSIDEEAAFQAALALSLSEN 


3676 


A 


3 


1856 


TLGRWLLGVYETVAPTLACLPRPRLRRRRRRRR 

RRMISRYIMCAWQSLELKGITKHALNHHPPPEK 

LEEISPTSDSHEKDTSSQSKSDITRESSFTSADTGN 

SLSAFPSYTGAGISTEGSSDFSWGYGELDQNATE 

KVQTMFTAIDELLYEQKLSVHTKSLQEECQQWT 

ASFPHLRJLGRQHTPSEGYRLYPRSPSAVSASYET 

TLSQERDSTIFGIRGKKLHFSSSYAHKASSIAKSSS 

FCSMERDEEDSnVSEGIIEEYLAFDHIDIEEGFHG 

KKSEAATEKQKLGYPPIAPFYCMKEDVLAYVFD 

SVWCKWSCMEQLTRSHWEGFASDDESNVAVT 

RPDSESSCVLSELHPLVLPRVPQSKVLYTTSNPMS 

LCQASRHQPNVNDLLVHGMPLQPRNLSLMDKLL 

DLDDKLLMRPGSSmSTRNWPNRAVEFSTSSLS 

YTVQSTTIRRI^PPRTLHPISTSHSCAETPRSVEEIL 



3677 






3678 



3679 






246 



757 



20 



1508 



3680 



249 



2146 




RGARVPVAPUSLSSPSPTPLSKNNLLPPlGTAiiVli 
HVSTVGPQRQMKPHGDSSRAQSAWDEPNYQQ 
POERLLLPDFFPRPNTTQSFLLDTQYRRSCAVEYP 
HQARPGRGSAGPQLHGSTKSQSGGRPVSRmQG 



Ml?I OGAIFVLlJHLGPlLVWLFTRDHMSljWCiiG 
Ss^SnivLLLVQTAIYSWGYASYLVm 
DLGGGLGWPLALPLGLYAVQLTISWTVLVLm: 
VHNPGIALLHLLLLYGLWSTAIJWHPINKLAAL 

uTpVLAWLIVTSALTYa 
EKSD 



RGKAEFFUVMAGTOALLMLKNFiDGOLPCSSYi 
DSYDPSTGEVYCRVPNSGKDEIEAAVKAAREAFP 
SWSSRSPQERSRVLNQVADLLEQSLEEFAQAESK 
DQGKTLALARTMDIPRSVQMFRFFASSSLinirSE 
CTQMDHLGCMHYTVRAPVGVAGLISPWNLPLY 
LLTWKIAPAMAAGNTVIAKPSELTSVTAWI^CK 
LLDKAGVPPGWNIVFGTGPRVGEALVSHPEVPL 
TSFTGSOFTAERITQLSAPHCKKLSLELGGKNPAn 
FEDANLDEOTATVRSSFANQGEICLCTSRFVQK 
SIYSEFLKRFVEATRKWKVGIPSDPLVSIGALISK 
AHLEKVRSYVKRALAEGAQIWCGEGVDKLSLPA 
RNQAGYFMLPTVITDIKDESCCMTCEIFGPVTCV 
WFDSEEEVIERANNVKYGLAATVWSSNVGRVH 
RVAKKLQSGLVWTNCWLIRELNLPFGGMKSSGI 
rxRROAKDSYDFFTEIKTITVKH 



MAmTSFYMEIQTTIREYYliHLi^Al^KLbWLEEMD 
KFLDTYTtPRLNQEEVESLNRPITGSEIEAIINSLP 
TKKIPGPDRFTAKFYQRYKEELSNLlHyLGLSHH 
LU^miVSFGKKSAWSSAQVKVTOTOroGVEV 
RVFEGPPKPEEPLKRSWYfflGGGWALASAKIRY 
YDELCTAMAEELNAVIVSIEYRLVPKVYFPEQIH 
DWRATKYFLKPEVLQKYMVDPGRICISGDSAG 
GNLAAALGQQFTQDASLKNKLKLQALIYPVLQA 

ldfntpsywWtpilpryvmvkywvdyfkg 

NYDFVQAMIVNNHTSLDVEEAAAVRARLNWTS 
LLPASFTKNYKPWQ1TGNARIVQELPQLLDARS 
APLIADQAVLQLLPKTYILTCEHDVLRDDGIMYA 
raLESAGVEVTLDHFEDGFHGCMIFTSWPTNFSV 

GIR1RNSY1KWLDQNL 



RSWGAPWFWKMRLLRRKHMFLRLAMVGUAFV 
LFLFLLHRDVSSREEATEKPWLKSLVSRKDHVLD 

lEaSlrdsmpklqirapeaqqtlfs^^^^ 

LPGFYTPAELKPFWERPPQDPNAPGADGKAFQK 
SKWLETQEKEEGYKKHCFNAFASDRISLQRSL 
CTDTRPPECVDQKFRRCPPIATOXTIVFHNEAW^ 
TLLRTVYSVLHTTPAILLKEnLVDDASTEEHLKE 

VVSPDIVTIDLNTFEFAKPVQRGRVHSRGNFDWS 
LTFGWETLPPHEKQRRKDETYPIKSPTFAGGLFSl 
SKSYFEHIGTYDNQMEIWGGENVEMSFRVWQC 
^^Era^SWGH^-rcSPHlTPKGTSVL^Q 

vrSevwmdsykkifyrrnlqaakmaqeksfg 

^y>rr„ WPHKFSWYLHNVYPEMFVPDL 










TPTFYGAIKNLGTNQCLDVGENNRGGKPLIMYS 
CHGLGGNQYFEYTTQRDLRHNIAKQLCLHVSKG 
ALGLGSCHFTGKNSQVPKDEEWELAQDQLIRNS 
GSGTCLTSQDKKPAMAPCNPSDPHQLWLFV 


3681 


A 


2982 


1869 


LKBTLKSQMTQEASDEAEDMKEAMNRMIDELN 

KQVSELSQLYKEAQAELEDYRKRKSLEDVTAEY 

IHKAEHEKLMQLTm^SRAKAEDALSEMKSQYSK 

VLNELTQLKQLVDAQKENSVSITEHLQVITTLRT 

AAKEMEEKISNLKEHLASKEVEVAKLEKQLLEE 

ESVKEKEKVHSEVVQIRSEVSQVKREKENIQTLL 
KSKEQEVNELLQKFQQAQEELAEMKRYSESSSK 
LEEDKDKKINEMSKEVTKLKEALNSLSQLSYSTS 
SSKRQSQQLEALQQQVKQLQNQLAECKKQHQE 
VISVYRMHLLYAVQGQMDEDVQKVLKQILTMC 


3682 


A 


447 


1024 


AQALTAGRQLALAAPFIAPISPISLPRLNPPSQSW 

NSTPFFKVKLPPQKEVTTSDELMAHLGNCLLSIKP 

QEKSEGLQLNFQQNVDDAMTVLPKLATGLDVN 

VRFTGVSDFEYTPECSVFDLLGIPLYHGWLVDPQ 

QSPEAVRAVGKLSYNQI/VGEDHHLQTLQ*HQP 

RDRKPDCRAVPGDHRGPSDLPRTV 


3683 


A 


2 


942


LEIKQEEKFVGQCnCEELMHGECVKEEKDFLKKE 
IVDDTK\aCEEPPINHPVGCKRKLAMSRCETCGTE 
EAKYRCPRCMRYSCSLPCVKKHKAELTCNGVRD 

DAFLkRPISNKYMYFMKmAliRQGIN^ 

FTKRKENSTFFDKKKQQFCWHVXLQFPQSQA\ST 

*KKRVPDDKTINEILKPYIDPEKSDPVIRQRLKAYI 

RSQTGVQn.MKIEYMQQ>ILVRYYELDPYKSLLD 

NLRNKVIIEYPTLHVVLKGSNNDMKVLHQVKSE 

STKNVGNEN 


3684 


A 


119 


1533 


SLQENVQEKRVRVCPGLGGLLPNGTPSITAAAAP 

QVLWRHVQPGCSHHLHACVIRAACRAGEGHAD 

RHAGPPET/PVTLPSSWPWSSPWERQCPMH\L*AP 

GHAFRPVPTEHRRGWAALGHHRAAAGPLREPAS 

GSQPAPASC*PECHHGCPEQTRQCQDLLREAW 

APEQRG*PCAHLQT*ATATTLCPQVPAGRVWQP 

GHSCHLLPHRHDGSH*HHCAAHRRPVTRRQAAH 

GVPLPDACYSPHHTLPAAPPPATRPAGHTATHPE 

♦GGDT .TPVPDGPKDrPRDVOGTPfi AfiGGSOT A PC 

CPPFPAAPVSVQGTQGLGPKNVLH*QWEGIRWQ 

KEPE/PGPPPEVELKRGAKCRIGDHGLGAVLGQG 

EYAS*SPSIPW*ASSSACPPLHPTP/TVYTQSPAAA 

PGWTRPPSP/PPPGLYPGP/PASHAPGVRGGISHOL 

X \J TT X XXX X t^X 1 X A M, ViJX_/ X X \JX IX iX^iJX XXXX \J T XvvJ VJ XkJX XVXl^ 

YSLP*LCRECCSCP/PPPPAHGGRCPSLLPPEALAK 
LLL 


3685 


A 


101 


438 


AWVLQCKINTELQTEVVMLKSMVLWLGEQVQS 
LQLQQQLHCHFNHTHICVTNLEYN\KEYPWDLV 
KAHLQGASTSNITFDiGELQKKMLDLNKQTQEFQ 
PSL*AWTEFQQGLE 


3686 


A 


105 


845 


VSDWKNQLVEVQCRQDGCDAVENVHQMFMF 
NWFTDCLWILFLSNYQPSYESSSPGGSATSDDHE 
FDPSADMLVHDFDDERTLEEEEMMEGETNFSSEI 
BDLAREGDMPIHELLSLYGYGSTVRLPEEDEEEE 
JEEEEGEDDHDADNODNSGCSCJiiWKliliNiiUJsa 
3QEDETQSSNDDPSQSVASQDAQEIIRPRRCK.YF 
DTNSEVEEESEEDEDYBP/SnSFFQSSDGPSSSSSE 




3687




49 


1225 


PVLVTSLRMREADTLRPPQLMEVSADUS 1 Vti-N 

HTGELLATGDKGGRWIFQREPESKNAPHSQGE 

YDVYSTFQSHEPEFDYLKSLEIEEKINKIKWLPQQ 

NAAHSLLSTNDKTIKLWKITEM)KRPEGmJ^ 

EGKLKDLSTVTSLQVPVLKPMDLMVEVSPRRIFA 

NGHTYHINSISVNSDCETYMSADDLRINLWHLAI 

TDRSFIPVNWDIKPANMEDLTEVITASEFHPHHC^ 

ilLFVYSSSKGSLRLCDMRAAALCDKHSKLFEEPE 

DPSNRSFFSEnS\SVSI)VKFSHSDRYMLTR.\DYLT 

VKVWDLNMEARPffiTYQVHDYLRSiaCa^ 

CIFDKFECAWNGSDRmMTGAYNNFFRMFDRNT 

K-RnVTLEASRGSSKPRAVL 


3688 


A 


1 


401


KKVPGRLSHMSFblJlm:PANTOSP^ 
GLAAGIPLLVATALLVALLFTLIHRRRSSIEAMEE 
SDRPCEISETODNPKISENPRRSPTHEKNTMGAQE 
AHTYVKTVAGSEEPVHDRYRPTIEMERRR 


3689 

3690 


A
A 


698 

61 


889 

153 


GRVLVHCAMGVSRSATLVLAFLMIYENMlLVtA 
TPnrTAGPPOISALTQAFVRQLOVLDNRLGRE 

^mTTahlvrrylgdasvhpdplqmptfpfuyui- 


3691 
3692 


A 
A 


61 
3 


153 
2831 


MGAHT.VRRYLGDASVEPDPLOMPTFPPDYGF 
PLVMa.LRQTLRRVGGARAVKiiAVMRAVLlWK 
DKAEHCINDIAFKPDGTQLILAAGSRLLVTOTSD 
GTLLQPLKGHKDTVYCVAYAKDGKRFASGSAD 
KSVnWSKLEGILKYTHNDAIQCVSYNPITHQLA
SCSSSDFGLWSPEQKSVSKHKSSSKIICCSWTNDG 
OYLALGMFNOnSIRNKNGEEKVKlERPGGSLSPI 
WSICWNPSSRWESFWMNRENEDAEDVIVNRYIQ 
EIPSTLKSAVYSSQGSEAEEEEPEEEDDSPRDDNL 
EERNDILAVADWG\QKVSFYQLSGKQIQKDRAL 

Sfdpccisyftkgeyillggsdkqvslftkdgvr 

LGTVGEQNSWVWTGQAKPDSNYWGGCQDGTI 
SFYQLIFSTVHGLYKDRYAYRDSMTDVIVQHLIT 

eokvrikckelvkkiaiyrnklaiqlpekiliyely 

Sm)LSD^fflYRVKElaIKKFECNLLWCAlfflI^X 

qekrlqclsfsgvkerewqmesliryika^ggpp 

GREGLLVGLKNGQILKIFVDNLFAIVLLKQATAy 
RCLDMSASRKKLAVVDENDTCLVYDIDTKELLF 
OEPNANSVAWNTQCEDMLCFSGGGYLNIKASTF 

pvhroklqgfwgyngskifclhvfsisavevpq 

SAPMYQYLDRKLFKEAYQIACLGVTDTDWRELA 

mealegldfetakkerkkrgetnndlfladws 

TOGKFHEAAKLYKRSGHENLALEMYTDLCMFE 

yakdflgsgdpketkmlitkqadwarnikepka 

AVEMYISAGEHVKAIEICGDHGWVDMLIDIARK 

lSreplllcatylkkldspgyaaetylkmg 

nTKSLVOLHVETORWDEAFALGEKHPEFKDDIY 

S^SXSendrfeeaqkafhkagrqreavqv 

liQLTNNAVAESRFNDAAYYYWMLSMQCLDIA 
- -StSi'VFHSNTSVSSLLHRPGHV lytjL imo 


3693 


A 


3 


1099 


fiWRIfflRPHTATOEWPFCT^ 1 
PLRLDOnQWSYWAVFAPIWLWKLLWAGASVG 

AGVWARNPRYRTEGEACVEFKAMLIAVGIHLLL 

LMFEVf VrDRVERGTHFWLLVFMPLFFVSPVSV 

AACVWGFRHDRSLELEILCSVNILQFIFIALKLDRI 

IHWPWLVVFWLWILMSFLCLVVLYYTSWSLLFL 

RSLDWAEQRRTHVTMAISWITIVVPLLTFEVLL 

VHRLDGHNTFSYVSffYPLWLSLLTLMATTFRRK 

GGNHWWFAJRRDF/CQDQLPQPTGKPPPPPLTDH 

HGEKALPLQNKDRGSWPASRGSPRLL 


3694 


A 


483 


761 


PRSLIDYKSYMDTKLLVARFLEQSSCTMTPDIHE 
LVENIKSVLKSDEEHMEEAITSASFLEQIMAHSX 
QHIRAHKLPXETAGLXTSELRXLTP 


3695 


A 


483 


761 


PRSLIDYKSYMDTBCLLVARFLEQSSCTMTPDIHE 
LVENIKSVLKSDEEHMEEAITSASFLEQIMAHSX 
QHIRAHKLPXETAGLXTSELRXLTP 


3696


A 


456 


733 


LSAALWEEPILSLWSETKELTNRGKMNYPOIGPH 
RPHVKGLRVRPGPGTLSNAPKSLCPGMSNSDRGI 
H\GGEGQGPGKRAGHLGRGGGMSFL 


3697 


A 


877 


1873 


VWL*TLS*HTCALMTVCRSCLVKYLEENNTCPT 
CRIVIHQSHPLQYIGHDRTMQDIVYKLVPGLQEA 
EMRKQREFYHKLGMEVPGDIKGETCSAKQHLDS 
HRNGETKADDS SNKE AAE 


3698 


A 


1 



572 


KQCGIPHEVVRDENSSVYAEVSRLLLATGHWKR 
LRRDNPRFNLMLGERNRLPFGRLGHEPGLVQLV 
NYYRGADKLCRKASLVKLIKTSPELAESCTWFPE 
SYYIYPTNLKTPVAPAQNGIQPPISNSRTDEREFFL
"ASYNRKkEEJGEGNVWIAKSSAGAKVWVOW*M 
TDLEEEIDIPSPVGLGLESEWPL 


3699 


A 


2008 


2432 


LHCKMGALETQTHPCSQNMLRSLQKCCCKVEE 

HHLQPVQVLQTLLHSATAGTGCRRPARPPPAPPT 

Fn>WRSRQSGKQSERAS*LKGRGRYGLGALGGR 

GGRALGGSRWPPPLPGETLFSGCKHRRRRRGSD 

AAPGEEAGT 


3700 


A 


33 


1318 


GYQIGMALASGPARRALAGSGQLGLGGFGAPRR 

GAYEWGVRSTRKSEPPPLDRVYEIPGLEPITFAG 

KMHFVPWLARPIFPPWDRGYKDPRFYRSPPLHE 

HPLYKDQACYIFHHRCRLLEGVKQALWLTKTKL 

lEGLPEKVLSLVDDPRNHIENQDECVLNVISHARL 

WQTTEEIPKRETYCPVIVDNLIQLCKSQILKHPSL 

ARRICVONSTFSATWNRESLLLOVRGSGGARLST 

KDPLPTL\SREEIEATKNHVLETFYPISPIIDLHECN 

lYDVKNDTGFQEGYPYPYPHTLYLLDKANLRPH 

RLQPDQLRAKMILFAFGSALAQARLLYGNDAKV 

LEOPVWOSVGTDGRVFHFLVFOLNTTDLDSNE 

GVKNLAWVDSDQLLYQHFWCLPVIKKRWVEP 

VGPVGFKPETFRKFLALYLHGAA 


3701 


A 


86 


465 


WTLCGPEAGMVGYDPJCPDGRNNTKFQVAVAGS 
VSGLVTRALISPFDVIKIRFQLQHERLSRSDPSAK 
YHGILQASRQILQEEGPTAFWKGHVPAQILSIGY 
GAVQFLSFEMLTELVHRGSVYDARE 


3702 


A 


166 


814 


GFWEKTNQSSHSMDPLGAPSQFVDVDTLPSWGD 
SCQDELNSSDTTAEIFQEDTVRSPFLYNKDVNGK 
VVLWKGDVALLNCTAT/NTSNESLTDKNPVSESI 
FMLAGPDLKEDLQKLKGCRTGEAQLTKGFNLAA 
RFIIHTVGPKYKSRYRTAAESSLYSCYR>rVLQLA 
KEQSMSSVGI-CVINSAKRQYPLKDATHIALKIVK 
m.RIHGETIEKVV 


3703 


A


28 


1255 


SLGPSPKSATEPCCGDTMAPHEDAGGEALGUSi' W 

EAGNYRRTVQRVEDGHRLCGDLVSCFQERARIE 

KAYAOOLADWARKWRGTVEKGPQYGTLEKAW 

HAFFTAAERLSALHLEVREKLQGQDSERVRAWQ 

RGAFHRPVLGGFRESRAAEDGFRKAQKPWLKRL 

KEVEASKKSYHAARKDEKTAQTOESHAKADSA 

VSOEQLRKLQERVERCAKEAEKTKAQYEQTLAE 

UmYTPRYMEDMBQAFETCQAAERQRLLFFKD 

MLLTLHQHLDLSSSEKFHELHRDLHQGIE^^^ 

EDLRWWRSraOPGMAMNWPQFEEWSLDTQRTI 

SRKEKGGRSPDEVTLTSIVPTRDGTAPPPQSPGSP 

GTGODEEWSDEESP 


3704 


A 


1 


271 


ARGEDLALATGGGPDTVTHSNMPCPNbLVYDC 
WLNIKECSVGEHTFEDLGLCPGRNQREKKRSYK 
DFT .REEEKIAAOVRNSSKKKLKDSE 


3705 


A 


170 


1318 


LNWANLVIMWPREEEKBKVQDYSLGGLSFULKi 

DVSRKKKILKAYDEDEDEDLYPDIHPPPSLPLPG 

OFTCPQCRKSFTRRSFRPNLQLANMVQIIRQMCP 

TPYRGNRSNDQGMCFKHQEALKLFCEVDKEAIC 

VVCRESRSHKQHSVLPLEEWQEYKAKLQGHVE 

PLRKHLEAVQKMKAKEERRVTELKSQMKSELA 

AVASEFGRLTRFLAEEQAGLERRLREMHEAQLG 

RAGAAASRLAEQAAQLSRLLAEAQERSQQGGLR 

LLQDIKE1FNRCEEVQLQPPEVWSPDPCQPHSHD 

FLTDAIVRKMSRMFCQAARVDLTLDPDTAHPAL 

MLSPDRRGVRLAERRQEVADHPKRFSADGCVLG 

AOGFRSGRHYAVEVCMGP 


3706 


A 


204 


1996 


SRERQrrWMDHNFAPAFPHM(jSHOAPGPGTSFS 

HSHVLGRPIRPSRLPGGGSPLTPVLRKTIHLD'nFP 

OSHIPQTSSRLGLGARTRSVPPQETGIALGASLSP 

LPTSSLVPRKLSSISLTLHQNSQARSLDRPLSHWE 

ELPTPGKKAAPHEGGRVSSPGSPPVTLVPGGRVH 

SBGPGNPGLTKSNRMLATEiCPLVSSYLALPFQSR 

LAOSAPVLAEPGSLGQGHLVSVTDHMPTRASPG 

KGlb>RARGIPRPRGRLQRANTTVNLTAl^TRTD 

AARHLATMATNRPSLAINLATPNTSQLDTGT^P 

ALDIKIX3TARDLSSVGTVKSGKTV>nJVTAGTrKP 

GTAM>tt-TTVGTTKPGMVMDLIASEPDKLGKAM 

ATRSTAKPDMTTEGIAMDSATSDPVKPDTITATV. 

GTS1U.ETAMAUUIVNRAKLGTAKNSI^DTSR 

MGTAVGSVVPVTPDPATGKTTLGSVNNLTISDV 

ATCLLMPSRSTDLALDNTNAAMDRATEPASLDL 

AIEYKGKCRNLVGDGLGCREGEVCELGDGSMK 

PMSINSNLLGYIGIDTIIEQMRKKTMKTGFDFNIM 

\"t'r,TFn'-0 A A Ar.T.VAGSTKDPISFPQ 


3707 


A 


3 


549 


- ■ SSSISRDFLGQAACASOTMLRWLRDFVLP 1 AACt^ 
DAEOPMRYETLFQALDRNGDGVVDIGELQEGLR 
NLGffLGODAEEKEFTTGDVNKDGKLDFEEFMKY 
LKDHEKKMKLAFKSLDKNNDGKIEASEIVQSLQ 
TLGLTISEQQABLILQSIDVDGTMTVDWNEWRD 

YFLFNPVTDIEEIIR 


3708 


A 


1 


1866 


EFRGAGRAMMLAPRGAAVLLLHLVLQRWLAAU
AOATPQVFDLLPSSSQRLNPGALLPVLTDPALND 
LYVISTFKLQTKSSATIFGLYSSTDNSKYFEFTVM 

GRLSKAILRYLKNDGKVHLVVFNNLQLADGRRH 

RILLRLSNLQRGAGSLELYLDCIQVDSVHNLPRA 

FAGPSQKPEHELRTFQRKPQDFLEELKLVVRGSL 

FQVASLQDCFLQQSEPLAATGTGDFNRQFLGQM 

TQLNQLLGEVKDLLRQEVNETSFLRNTITECQAC 

GPLKFQSPTPSTVVPPASPAPPTRPPRRCDSNPCF 

RGVQCTDSRDGFQCGPCPEGYTGNGITCIDVDEC 

KYHPCYPGEHCINLSPGFRCDACPVGFTGPMVQ 

GVGISFAKSNKQVCTDIDECRNGACVPNSICVNT 

PCSVNAQCIEERQGDVTCVCGVGWAGDGYICGK 

DVDIDSYPDEELPCSARNCKKDNCKYVPNSGQE 

DADRDGlGDACDEDADGDGILNEODNCVLTPfNV 

DQRNSDKDIFGDACDNCLSVLNNDQKDTDGDG 

RGDACDDDMDGDGIKNILDNCPKFPNRDQRDK 

DGDGVGDACDSCPDVSNPNQ 


3709 


A 


144 


417 


TQAMEGLLHYINPAHAISLLSALNEERLKGQLCD 
SQTVFQLDFCEPDAFDNVLNYIY 


3710 


A 


245 


685


FGMLKNKGHSSKKDNLAVNAVALQDHILHDLQ 

LRI^SVADHSKTQVQKKENKSLKRDTKAIIDTGL 

KKTTQCPKLEDSEKEYVLDPKPPPLTLAQKLGLI 

GPPPPPLSSDEWEKVKQRSLLQGDSVQPCPICKE 

EFELRPQVFSIRG 


3711


A 



3


773 



SLEMSSDGEPLSRMDSEDSISSTIMpyDSTISSGRS 
Ti^ AA/FN/n^Gr^f^^inr^s^ifWTA vwrr ppmc- 

J. jr/\lVliVllN VJV^VJO 1 1 OOoJN^^ Irx I IN V'^-' W L/V^.V^V^/\V^nN o 

SPDLADHIRSIHVDGQRGGVFVCLWKGCKVYNT 
PSTSQSWLQRHMLTHSGDKPFKCVVGGCNASFA 
SQGGLARHVPTHFSQQNSSKVSSQPKAKEESPSK 
AGMNKRRKLlC>nCRRRSLARPHDFFD AOTLD A TR 
HRAICFNLSAfflESLGKGHSVVFHSTVSILLFFQIK 
YKTLQKNISTIISKSLKI 


3712 


A 


2 


344 


RATWHNAGKEREAVQLMAGAEKRVKASHSFLR 
GLFGGNTRffiEACEMYTRAANMFKMAKlWSAA 
GNAFCQAAKLHMQLQSKHDSATSFVDAGNAYK 
KADPQGKTARHVACYLCV 


3713 


A 


20 


974 


GAAATACSSSSSSSGAPATWAAHGPGKDVASPS 

SVSLSPRRSRLLVLRCGLRRNPERPSSSPALRRLL 

LLLLLLLLLLLGFLLSPGPERGVGGGRFGRRLAL 

LWAAALGHVVSGKVMSRRAPGSRI ^SGGGGGG 

TNYSRSWNDWQPRTDSASADPGNLKYSSSRDRG 

GSSSYGLQPSNSAWSRQRHDDTRVHADIQNDE 

KGGYSVNGGSGENTYGRKSLGOELRVNNVTSPE 

FTSVQHGSRALAIXDMRKSQERSMSYCDESRLS 

YLLRRJTRENDRDRRLATVKQLKEFIQQPENKLV 

LVKQLDILAAVHDVLNER 


3714 


A 


237 


458 


IFALKSPSYLLPCCTPEGKMDHKQLCWSHPQKSG 
QSSRSCCICSNQHGLIWKYSLNMCLQCCHQYVK 
DIGFUCL 


3715 


A 


970 


1524 


LCTLSPGISGTAGSCLTTEPGTELGTSFAQNGFYH 

EAWLFTQALKLNPQDHRLFGNRSFCHERLGQP 

AWALADAQVALTLRPGWPRGLFRLGKALMGLQ 

RFREAAAVFQETLRGGSQPDAARELRSCLLHLTL 

QGQRGGICAPPLSPGALQPLPHAELAPSGLPSLRC 
PR55TAT.RSPGLSPLLH 


3716 


A


55 


308


OGLPSTMVKLGCSFSOKPCiKDPGDQDGAAMUS 
VPLISPLDISQLQPPLPDQWKTQTEYQLSSPDQQ 

NYTKSR „^.^r.nvr.,/-p- 


3717 


A 


58 


618 


GAGCtSPGLWARKAAARCLFlYPSRAQPSNVGK 

RRRRia>GLGALAAGVPAMAESVERLQQRVQELE 

RELAQERSLQVPRSGDGGGGRVRIEKMSSEWD 

SNPYSRLMALKRMGIVSDYEKIRTFAVArVGVGG 

VGSVTAEMLTRCGIGKLLLFDYDKVELANMNRL 

FFOPHOAGLSKVOAAGHTPEE 


3718 


A 


3 


593 


RGAGGRAGGRADGQPNMAUQRQRSLS 1 i,(jbi>L 

YHVLGLDKNATSDDDCKSYRKLALKYHPDKNPD 

NPEAADKFKEINNAHAILTDATKRMYDKyGSLG 

LYVAEQFGEENVNTYFVLSSWWAKALFVFCGLL 

TCCYCCCCLCCCFNCCCGKCKPKAPEGEETEFY 

YC5ppT>^ p/^r>i nsnKRRATDTPIVIOPASATEP 


3719 


A 


2 


2173 


iGGVmGSRADGPRl-SGHVlGKMAVFFWHSiyN 
RJJYKAEFASCRLEAVPLEFQDYHPLKPITVTESK 
TKKVNRKGSTSSTSSSSSSSWDPLSSVLDGTDPL 
SMFAATADPAALAAAMDSSRRKRDRDDNSWG 
SDFEPWTNKRGEILARYTTTEKLSINLFMGSEKG 
KAGTATLAMSEKVRTRLEELDDFEEGSQKELLN 
LTOODYVNRIEELNQSLKDAWASDQKVKAPKN 
VHPGKLVYERIFSMCVDSRSVLPDHFSPENANDT 
AKETCLNWFFKIASIRELIPRFYVEASILKCNKFLS 
KTGISECLPRLTCMIRGIGDPL\GSVYARAYL\SRV 
GMEVAPHLKETLMKNFFDFLLTFKQfflGDTVQN 
OLVVQGVELPSYLPLYPPAMDWIFQCISYHAPEA 
LLTEMMERCKKLGNNALLLNSVMSAFRAEFIAT 
RSMDFIGMIKECDESGFPKHLLFRSLGLNLALAD 
PPESDRLQILNEAWKVITKLKNPQDYINCAEVWV 
EYTCKHFTKREVNTVLADVIKHMTPDRAFEDSY 
POLOLIIKKVIAHFHDFSVLFSVEKFLPFLDMFQK 
ESVRVEVCKCIVRTPLSSINKSPPRTRSS*MPFCMF 
ARPCMTL/CNALTLEDEKRMLSYLINGFIKMVSF 
GRDFEQQLSFYVESRSMFCNLEPVLVQLIHSVNR 
LAMEreKVMKGNHSRKTAAFVRSWGAYWFITIP 

«?T.AGIFTRLNLYLHSG 


3720 


1 A 

A 




296 


- ©nmCTAFSOlRSrFYlSKTYC^Fi^^^ 
ADFNSKGTRDYSPRQMAVRE/KVFDVURCFKRH 

r.APVTDTPVFELKVRNGQEETTW 

" " PSCLTCVGHCSIGGSCI MlGIMMPliCHUSbHMTG 
PRCEEHVFILQQPGHIASILIPLLVLLLLALVAGW 
FWHKRRVQGAKGFQHQRMTNGAMNVEIGNPTY 


3721 


A




""3T0 


3722 


A 


/ J 


722 


MELVAGCYEQVLFGFAVHPHPEACGUHbQWlL
VADFTHHAHTASLSAVAVNSRFVVTGSKDETIHI 
YDMKKKmHGALVHHSGmCI^TCNm^^^^ 
EDGLICIWDAKKWECLKSIKAHKGQVTFLSIHPS 
GKLALSVGTDKTLRTWNLVEGRSAFIKNIKQNA 
fflVEWSPRGEQYVVnQNKTOIYQLDTASISGTlTN 

TJTfUTSSVKFLSES 


3723 


A 


110 


316 


" TMSDNRRSGGLHGLAbKCPNLTYLNLSONKJf. 
c^^j^^M ^^'Jr.TVT.ST.DLLFLVKFSEICLCLLISI 


1 3724 


A


_ J 


406 


--^VDRGTEAWORDPAFSGLgRVGGVDVS^VfLWS | 
WACASLGVLSFPELEVVYEESRMVSLTAPYVSG 
FLAFREVPFLLELVQQLREKEPGLMPQVLLVDGN 
GVLHHRGFGVACHLGVLTDLPCVGVAKKLLOV 
DG 


3725 


A 


3


406 


VDRGTEAWQRDFAFSGLQRVGGVDVSFYKGDS 

VRACASLGVLSFPELEWYEESRMVSLTAPYVSG 

FLAFREVPFLLELVQQLREKEPGLMPQVLLVDGN 

GVLHHRGFGVACHLGVLTDLPCVGVAKKLLOV 

DG 


3726 


A 


1 


433 


SSDDRSLFRRLKLNYAIFDEGHMLKNMGSIRYQ 

HLMTINANNRLLLTGTPVQNNLLELMSLLNFVM 
PHMFSSSTSEIRRMFSSKTKSADEOSIYEKERIAH 
AKQHKPFILRRVKEEVLKQLPPKKDRIELCAMSE 
KQEQLYLG 


3727 


A 


6 


383 


RIPRGKACXTVLGRSTGELEGFASSRLPPQPCGW 
GQSSDLLSRIDLDELMKKDEPPLDFPDTLEGFEY 
AFNEKGQLRHIKTGEPFVFNYREHLHRWNQKRY 
EALGEnTKYVYELLEKDCNSKKVS 


3728 


A 


3 


2452 


EIAGAAAENMLGSLLCLPGSGSVLLDPCTGSTISE 

TTSEAWSVEVLPSDSEAPDLKQEERLQELESCSG 

LGSTSDDTDVREVSSRPSTPGLSWSGISATSEDIP 

NKIEDLRSECSSDFGGKDSVtSPDMDEITHDFLYI 

LQPKQHFQ^^EAEADMRIQLSSSAHQLTSPPSQSE 

SLLAMFDPLSSHEGASAVVRPKVHYARPSHPPPD 

PPILEGAVGGNEARLPNFGSPMF*LPAEMEAFKQ 

RHSA'TPERLVRSRSSXDIVSSVRRPMSDPSWNRR 

p(GNEERELPPAAAIGATSLVi^ 

RGETEERKDSDDEKSDR>JRPWWRKRFVSAMPK 

APIPFRKKEKQEKDKDDLGPDRFSTLTDDPSPRLS 

AQAQVAEDELDKYRNAIKRTSPSDGAMANYEST 

EVMGDGESAHDSPRDEALQNISADDLPDSASQA 

AHPQDSAFSYRDAKKKLRLALCSADSVAFPVLTV 

HSTRNGLPDHTDPEDNEIVCFLKVQIAEAINLQD 

KNLMAQLQETMRCVCRFDNRTCRKLLASIAEDY 

RKRAPYIAYLTRCRQGLQTTQAHLERLLQRVLR 

DKEVANRYFTTVCVRLLLESKEKKIREnQDFQK 

ltaaddktaovedfloflygamaodviwonas 

EEQLQDAQLAIERSVMNRIFBCIj\FYPNQDGDILR 

DQVLHEmQRLSKVVTANHRALQIPEVYLREAP 

WPSAQSEIRTISAYKTPRDKVQCILRMCSTIMNLL 

SLANEDSVPGADDFVPVLVFVLIKANPPCLLSTV 

QYISSFYASCLSGEESYWWMQFTAAVEFIKTIDD 

RK 


3729 


A 


3 


2452 


EIAGAAAENMLGSLLCLPGSGSVLLDPCTGSnSE 

TTSEAWSVEVLPSDSEAPDLKQEERLQELESCSG 

LGSTSDDTDVREVSSRPSTPGLSWSGISATSEDIP 

NKffiDLRSECSSDFGGKDSVTSPDMDEITHDFLYI 

LQPKQHFQHIEAEADMRIQLSSSAHQLTSPPSQSE 

SLLAMFDPLSSHEGASAVVRPKVHYARPSHiPPPD 

PPILEGAVGGNEARLPNFGSPMF*LPAEMEAFKQ 

RHS/YTPERLVRSRSS\DIVSSVRRPMSDPSWNRR 

PXGNEERELPPAAAIGATSLVAAPHSSSSSPSKDSS 

RGETEERKDSDDEKSDR^IRPWWRKRFVSAMPK 

APPFRKKEKQEKDKDDLGPDRFSTLTDDPSPRLS 

AQAQVAEDELDKYRNAKRTSPSDGAMANYEST 
EVMGDGESAHDSPRDEALQNISADDLPDSASyA 

AHPQDSAFSYRDAKKKLRLALCSADSVAFPVL1\ 

HSTWIGLPDHIDPEDNEIVCFLKVQIAEAINLQD 

KNLMAQLQETMRCVCRFDNRTCRKLLASIAEDY 

RKRAPYIAYLTRCRQGLQTTQAHLERLLQRVLR 

DKEVANRYFTTVCVRLLLESKEKKIREFIQDFQK 

LTAADDKTAQVEDFLQFLYGAMAQDVIWQNAS 

EEOLQDAQLAIERSVMNRIFKLAFYPNQDGDILR 

DQVLHEfflQRLSKVVTANHRALQIPEVYLREAP 

WPSAQSEIRHSAYKTPRDKVQCILRMCSTIMNLL 

SLANEDSVPGADDFVPVLVFVLKAWPCLLSTV 

QYISSFYASCLSGEESYWWMQFTAAVEFIKTIDD 


3730 


A 


3 


2452 


EIAGAAAENMLGSLLCLPGSUSVLLDPCTOSTlSli 

TTSEAWSVEVLPSDSEAPDLKQEERLQELESCSG 

LGSTSDDTDVREVSSRPSTPGLSWSGISATSEDIP 

NKIEDLRSECSSDFGGKDSVTSPDMDEITHDFLYI 

LOPKQHFQHIEAEADMRIQLSSSAHQLTSPPSQSE 

SLLAMFDPLSSHEGASAVVRPKVHYARPSHPPPD 

PPILEGAVGGNEARLPNFGSPMF*LPAEMEAFKQ 

RHSA^TPERLVRSRSSVDIVSSVRRPMSDPSWNRR 

P\GNEERELPPAAAIQATSLVAAPHSSSSSPSKDSS 

RGETEERKDSDDEKSDRNRPWWRKRFVSAMPK 

APIPFRKKEKQEKDKDDLGPDRFSTLTDDPSPRLS 

AOAQVAEDILDKYRNAIKRTSPSDGAMANYEST 

EVMGDGESAHDSPRDEALQNISADDLPDSASQA 

AHPQDSAFSYRDAKKKLRLALCSADSVAFPVLT^ 

HSTRNGLPDHTDPEDNEIVCFLKVQIAEAINLQD 

KNLMAQLQETMRCVCRFDNRTCRKLLASIAEDY 

RKRAPYIAYLTRCRQGLQTTQAHLERLLQRVLR 

DKEVANRYFTTVCVRLLLESKEKKIREFIQDFQK 

LTAADDKTAQVEDFLQFLYGAMAQDVIWQNAS 

EEOLQDAQLAffiRSVMNRIFKLAFYPNQDGDILR 

DOVLHEfflQRLSKVVTANHRALQIPEVYLREAP 

WSAQSEIR'nSAYKTPRDKVQCE.RMCSTIMNLL 

SLANEDSVPGADDFVPVLVFVLIKANPPCLLSTV 

QYISSFYASCLSGEESYWWMQFTAAVEFIKTIDD 

RK 


3731 


A 


1 


1305 


VNTAMHEAKLMEECDELVEUQQRKQMIAVKJK

ETKVMKLRKLAQQVANCRQCLERSrVLINQAEH 

ILKENDQARFLQSAKNIAERVAMATASSQVLIPDI 

NFNDAFENFALDFSREKKLLEGLDYLTAPNPPSIR 

EELCTASHDTITVHWISDDEFSISSYELQYTIFTGQ 

ANFISLYNSVDSWMrVPNKQNHYTVHGLQSGTR 

YiFIVKAINQAGSRNSEPmKTNSQPFKLDPKMT 

HKKLKISNDGLQMEKDESSLKKSHIPERFSGTGC 

YVYGVLHNSDNS*MHSLSFPLSHRYAIGIAYKSA 

PK1^MGKNASSWVFSRCNS>«^VRHNNKEML 

VDVPPHLKRLGVLLDYDNY/NMLSFYDPANSLVH 

, T«rm->irmrT Dx/r-TJTmwi^JKSLMILSGLPAPDFI 
LHTFDVTrULJr V (^i^ li wi^jvojjivAii-»»j^*^* «•* - 

nvPKROECNCRPOESPYVSGMKTCH 


3732 


A 


127 


2832 


- t GORLSLVPRPSLKRRLUKKLSLGLRERMMSLW 
WS/GPKVRTQATTGARPKTETKSVPAARPKTEAQ 
AMSGARPKTEVQVMGGARPKTEAQGITGARPKT 
DARAVGGARSKTDAKAIPGARPKDEAQAWAQS 
EFGTEAVSQAEGVSQTNAVAWPLATAESGSVTK 

SK\ACLWIEN*SMWM/PETFPGTQGQKGIQPWFG 

PGEETNMGSWCYSRPRAREEASNESGFWSADET 

STASSFWTGEETSVRSWPREESNTRSRHRAKHQT 

NPRSRPRSKQEAYVDSWSGSEDEASNPFSFWVG 

E^^TNNLFRPRVREEANIRSKLRTNREDCFESESED 

EFYKQSWVLPGEEAN\mSGTETKKILILPWKLRA 

QKDVDSDRVKQEPRFEEEVnGSWFWAEKEASLE 

GGASAICESEPGTEEGAIGGSAYWAEEKSSLGAV 

AJREEAKPESEEEAIFGSWFWDRDEACFDLNPCPV 

YKVSDRFRDAAEELNASSRPQTWDEVTVEFKPG 

LFHGVGFRSTSPFGIPEEASEMLEAKPKNLELSPE 

GEEQESLLQPDQPSPEFTFQYDPSYRSVREEREHL 

RARESAESESWSCSCIQCELKIGSEEFEEFLLLMD 

KIRDPFIHEISKIAMGMRSASQFTRDFIRDSGVVS 

LffiTLLNYPSSRVRTSFLENMIHMAPPYPNLNMIE 

IDYHT\LUN*YGPGFPLLF*PQAQCGETKFHVLK 

MLLNLSENPAVAKKLFSAKALSIFVGLFNIEETN 

DMQIVIKMFQNISNIIKSGKMSLIDDDFSLEPLISA 

FREFEELAKQLQAQIDNQNDPEATGTTAFVGKG 

NNPSANRERLSPSVFCPGAQEAESLPARRVRGEE 

QRLLLEEVGARTADGIPEGW 


3733 


A 


2 


3274 


DVPLIRIEEDTGEIFTTGARIDREKLCAGIPRDEHC 

FYEVEVAELPDEIFRLVKIRFLIEDINDNAPLFPAT 

VINISIPENSAINSKYTLPAAVDPDVGINGYQNYE 

LIKSQNIFGIjD VffiTPGGDKMPQL WQKELb 

DTYVlVlKVkvEDGGFPQRSSTAILQVSVTDTNDN 

HPVFKETEDBVSIPENAPVGTSVTQLHATDADIGE 

NAKIHFSFSNLVSNIARRLFHLNATTGLITIKEPLD 

REETPNHKLLVLASDGGLMPARAMVLVNVTDV 

NDNWSromYIVNPVNDTVVLSENIPLNTKIALIT 

VTDKDADHNGRVTCFTDHEIPFRLRPVFSNQFLL 

ETAAYLDYESTKEYAIKLLA\ADAGKPPLNQSAM 

LFIKVKDENDNAPVFTQSFVTVSIPENNSPGIQLT 

KVSAMDADSGPNAKINYLLGPDAPPEFSLDCRT 

GMLTVVKKLDREKEDKYLFTILAKDNGVPPLTS 

^^VTVFVSIIDQ]^NSPWTHNEYNFYVPENLPRH 

GTVGLITVTDPDYGDNSAVTLSILDENDDFTIDSQ 

TGVIRPNISFDREKQESYTFYVKAEDGGRVSRSSS 

AKVTINVVDVNDNKPVFIVPPSNCSYELVLPSTN 

PGTVVFQVIAVDNDTGMNAEVRYSIVGGNTRDL 

FAIDQETGNITLMEKCDVTDLGLHRVLVKANDL 

GQPDSLFSWIVNLFVNESVTNATLINELVPQKH 

LKHQ*PQILEIADVSSPTSDYVKILVAAVAGTITV 

VWIFITAVVRCRQAPHLKAAQKNMQNSEWATP 

^^PE^nRQMIMMKKKKKKKKHSPK]sI^ 

TXADDVDSDGNRVTLDLPIDLEEQTMGKYNWV 

TTPTTFKPDSPDLARHYKSASPQPAFQIQPETPLN 

LKHHnQELPLDNTFVACDSISNCSSSSSDPYSVSD 

CGYPV1TFEVPVSVHTRPPVDLEVGGAQSGQVAI 

LTSSLMELLLCLMVAAFLPLELRPLGQQNVMSW 

EQEAKILLVGYWGDGEWCHFHFHHLEPGPVNPG 

YERKQYHILDSDSEDTQPSGELCPIPVRPFTILSIQ 

LLQDDGEHCGTKQGFQPAVQLGLLPHKTLK 


3734 


A 


1 


840 


GTRPGHLPAPSDGFCV/HL*SIPSWGSF»GESL/EM 

QLITSLGLQEFDIARNVLEUYAQTLVWIGIFFCPL 

LPnOMIMLFIMFYSJCNISLMMNFQPPSKAWRAS 

QMMITFIFLLFFPSFTGVLCTLAITIWRLKPSADC 

GPFRGLPUTHSIYSWIDTLSTRPGYLWVVWIYRN 

LIGSWFFILTLIVLnTYLYWQIlEGRKIMIRLLH 

EQIINEGKDKMFLIEKLIKLQDMEKKANPSSLVLE 

RREVEQQGFLHLGEHDQSLDLRSRRSVQEGNPR 


3735 


A 


2 


432 


VEVCRRYLWKMTVDASQNVQCCVIFSHFPHFN 
NI^KKLLHTDTLLKIESKKHKAYLRSAAIEEERE 
SEFALRPTFDLTVRRNHLIEDVLNQLSQFENEDL 
RKELWVSFSGEIGYDLGGSA^KKEIFYCLFAEMIQ 

PEYGMFMY 


3736 


A 


1542 


343 


KGAPSFVRLYQYPNFAGPHAALANKSFFKAUKV 

TMLWNKKATAVLVIASTDVDKTGASYYGEQTL 

HYIATNGESAVVQLPKNGPIYDWWNSSSTEFCA 

VYGFMPAKATIFNLKCDPVFDFGTGPKNAAYYS 

PHGHILVLAGFGNLILQPAD/IMKVWNVKNYKLI 

SKPVASDSTYFAWCPDGEHILTATCAPRLRVNN 

GYKIWHYTGSILHKYDVPSNAELWQVSWQPFLD 

GIFPAKTITYQAVPSEVPNEBPKVATAYRPPALRN 

KPITNSKLHEEEPPQNMKPQSGNDKPLSKTALKN 

QRKHEAKKAAKQEARSDKSPDLAPTPAPQSTPR 

NWSQSISGDPEroKKIKNLKKKLKAIEQLKEQAA 

TGKOLEKNOLEKIQKETALLOELEDLELGI 


3737 


A 


3190


664 


VAMGTPRAQHPPPPQLLFLJLLSCPWIQGLPLKEE 

EILPEPGSETPTVASEALAELLHGALLRRGPEMG 

YLPGPPLGPEGGEEETTTTIITTTTVTTTVTSPVLC 

NNNISEQEGYVESPDLGSPVSRTLGLLDCTYSIHV 

YPGYGIEIQVQTLNLSQEEELLVLAGGGSPGLAP 

RLLANSSMLGEGQVLRSPTNRLLLHFQSPRVPRG 

GGFRIHYQAYLLSCGFPPRPAHGDVSVTDLHPQG 

TATFHCDSGYQLQGEETLICLNGTRPSWNGEIPS 

CMASCGGTIHNATLGRIVSPEPGGAVGPNLTCR 

WVIEAAEGRRLHLHFERVSLDEDNDRLMVRSGG 

SPLSPVIYDSDMDDVPERGLISDAQSLYVELLSET 

PANPLLLSLRFEAEBBDRCFAPFLAHGNVTTTDPE 

YRPGALATFSCLPGYALEPPGPPNAIECVDPTEPH 

WNDTEPACBCAMCGGELSEPAGWLSPDWPQSY 

SPGQDCVWGVHVQEEKRILLQVEILNVREGDML 

TLFDGDGPSARVLAQLRGPQPRRRLLSSGPDLTL 

QFQAPPGPPNPGLGQGFVLHFKEVPRNDTCPELP 

PPEWGWRTASHGDLIRGTVLTYQCEPGYELLGS 

DEuTCQWDLSWSAAPPACQKIMTCADPGEIANG 

HRTASDAGFPVGSHVQYRCLPGYSLEGAAMLTC 

YSRDTGTPKWSDRVPKCALKYEPCLNPGVPENG 

YQTLYKHHYQAGESLRFFCYEGFELIGEVTITCV 

PGHPSQWTSQPPLCKVTQTTDPSRQLEGGNLAL 

ATT f T>T fJT VTVT GSGVYIYYTBCLOGKSLFGFSGSH 

SYSPITVESDFSNPLYEAGDTREYEVSI 


3738 


A 


3190 


664 


" ■ VAMGTPRAQHPPPPQLLFLILLSCPWlQGLPLUJiJi 
EILPEPGSETPTVASEALAELLHGALLRRGPEMG 
YLPGPPLGPEGGEEETTTrarnrVTTTVTSPVLC 
NNNISEGEQYVESPDLGSPVSRTLGLLDCTYSIHV 
YPGYGIEIQVQTLKLSQEEELLVLAGGGSPGLAP 

RLLANSSMLGEGQVLRSPTNRLLLHFQSPRVPRG 

GGFRIHYQAYLLSCGFPPRPAHGDVSVTDLHPGG 

TATFHCDSGYQLQGEEILICLNGTRPSWNGETPS 

CMASCGGTIHNATLGRIVSPEPGGAVGPNLTCR 

WVIEAAEGRRLHLHFERVSLDED>JDRLMVRSGG 

SPLSPVIYDSDMDDVPERGLISDAQSLYVELLSET 

PAhnPLLLSLRFEAPEEDRCFAPFLAHGNVTTTDPE 

YRPGALATFSCLPGYALEPPGPPNAIECVDPTEPH 

WNDTEPACKAMCGGELSEPAGWLSPDWPQSY 

SPGQDCVWGVHVQEEKRILLQVEILNVREGDML 

TLFDGDGPSARVLAQLRGPQPRRRLLSSGPDLTL 

QFQAPPGPPNPGLGQGFVLHFKEVPRNDTCPELP 

PPEWGWRTASHGDLIRGTVLTYQCEPGYELLGS 

DILTCQWDLSWSAAPPACQKIMTCADPGEIANG 

HRTASDAGFPVGSHVQYRCLPGYSLEGAAMLTC 

YSRDTGTPKWSDRVPKCALKYEPCLNPGVPENG 

YQTLYKHHYQAGESLRFFCYEGFELIGEVTITCV 

PGHPSQWTSQPPLCKVTQTTDPSRQLEGGNLAL 

AILLPLGLVIVLGSGVYrYYTBa.QGKSLFGFSGSH 

SYSPITVESDFSNPLYEAGDTREYEVSI 


3739 


A 


734 


445 


LLEPEPAEEYTEQSEVEST/EGMILI*CCLYFAAFQ 
TbA^SNIYFALQYVNRQFMAETQFTSGEKEQVDE 
WTVETVEVRVLCIAKLLSLSSVSNFYLY 


3740 


A 


2 


1578 


MAHYITFLCMVLVLLLQNSVLAEDGEVRSSCRT 
APTDLVFELDGSYSVGPENFEIVKKWLVNITKNF 
VDIGPiKFiQVGVVQYSDYPVLEIPLGSYDSGEHLTA 
AVESILYLGGNTKTGKAIQFALDYLFAKSSRFLT 
KIAWLTDGKSQDDVKDAAQAARDSKITLFAIG 
VGSETEDAELRAIANKPSSTYVFYVEDYIAISKIR 
EVMKQKLCEESVCPTRIPVAARDERGFDILLGLD 
VNKKVKJmQLSPKKnCGYEVTSKVDLSELTSNV 
FPEGLPPSYVFVSTQRFKVKBOWDLWRILTIDG/* 
PQIAVTLNGVDKILLFTTTSVINGSQVVTFANPQV 
KTLFDEGWHQIRLLVTEQDVTLYIDDQQIENKPL 
HPVLGILINGQTQIGKYSGKEETVQFDVQKLRIY 
CDPEQJWRETACEIPGFCLNGPSDVGSTPAPCICP 
PGKPGLQGPKGDPGLPGNPGYPGQPGQDGKPVS 
TESLVISGISGITGYQGIAGTPGVPGSPGIQGARGL. 
PGYKGEPGRDGDK 


3741 


A 


5048 


1236 


MSAPAGSSHPAASARIPPKFGGSAVSGAAAPAGP 

GAGPAPHQQNGPAQNQMQVPSGYGLHHQNYIA 

PSGHYSQGPGKMTSLPLDTQCGDYYSALYTVPT 

QNVTPNTVNQQPGAQQLYSRGPPAPHIVGSTLGS 

FQGAASSASHLHTSASQPYSSFVNHYNSPAMYS 

ASSSVASQGFPSTCGHYAMSTVSNAAYPSVSYPS 

LPAGDTYGQMFTSQNAPTVRPVKDNSFSGQNTA 

ISHPSPLPPLPSQQHHQQQSLSGYSTLTWSSPGLP 

STQDNLIRNHTGSLAVANNNPTITVADSLSCPVM 

QNVQPPKSSPWSTVLSGSSGSSSTRTPPTANHPV 

EPVTSVTQPSELLQQKGVQYGEYVNNQASSAPT 

PLSSTSDDEEEEEEDEEAGVDSSSTTSSASPMPNS 

YDALEGGSYPDMLSSSASSPAPDPAPEPDPASAP 

APASAPAPVVPQPSKMAKPLAMAIQHFSLVIRML 

QHHLFLEYSPSNPVYSGFQQYPQQYPGVNQLSSS 
PCT/US01/04098



IGGLSLQSSPQPESLRPVNLTQERNILPMTPVWAP 

VPNLNADLKKLNCSPDSFRCTLTNIPQTQALLNK 

AKLPLGLLLHPFRDLTQLPVrrSNnVRCRSCRTYI 

NPWSFIDQRR*KCNLCYRVNDVPEEFMY>JPLT 

RSYGEPHKRPEVQNSVTVEFIASSDYMLRPPQPAV 

YLFVLDVSHNAVEAGYLTI/LWCQSLLE\NLDKLP 

GVDSRT\MGFMm>\STYSFLQFTQEGLSQPQNlLl 

VSDIDDVFLPTPDSLLVNLYESKELIKDLO*ALPN 

MFTNTRETHSALGPALQAAFKLMSPTGGRVSVF 

QTQLPSLGAGLLQSREDPNQRSSTKVVQHLGPAT 

DFYKKLALDCSGQQTAVDLFLLSSQYSDLASLA 

CMSKYSAGCIYYYPSFHYTHNPSQAEKLQKDLK 

RYLTRKIGFEAVMRIRCTKGLSMHTFHGNFFVRS 

TDLLSLANINPDAGFAVQLSffiESLTOTSLVCFQT 

ALLYTSSKGERRIRVHTLCLPWSSLSDVYAGVD 

VQAAICLLANMAVDRSVSSSLSDARDALVNAW 

DSLSAYGSTVSNLQHSALMAPSSLKLFPLYVLAL 

LKQKAFRTGTSTRLDDRVYAMCQIKSQPLVHLM 

KMIHFNLYRIDRLTDEGAVHVNDRIVPQPPLQKL 

SAEKLTREGAFLMDCGSVFYIWVGKGCDKNFIE 

DVLGYTNFASIPQKMTHLPELDTLSSERARSFIT 

WLRDSRPLSPILHIVKDESPAKAEFFQHLIEDRTE 

AAFSYYEFLLHVQQQICK 


3742 


A 


934 


68 


SMLASQGVLLHPYGVPMlVPAAPYLPGUgUNC^b 

AAAAPDTMAQPYASAQFAPPQNGIPAEYTAPHP 

HPAPEYTGQTTVPEHTLNLYPPAQTHSEQSPADT 

SAQTVSGTRNKQD*RSTDGWPSPKTQTS*KHGK 

QVSSPSGLHVSNIPFR\FRDPDLRQMF\GQFGKILD 

VEIIFNERGSKGFGFVTFENSADADRAREKNLHGT 

VV\EGRKrEVNVNATAR\TVITNKKTVNPYTNGWK 

LNPWGAVYSPEFYAGTVLLCQANQEGSSMYSA 

PSTDFRGAKLHTSRPLLSGS 


3743 


A 


3 


1456 


■ QFQQAWMQNKVPffAPNEVLNDRKBDiKLEEKK 
KTQAEIEQEMATLQYTNPQLLEQLKffiRLAQKQV 
EQIQPPPSSGTPLLGPQPFPGQGPMSQIPQGF/PTA 
PSISADANEHGS\KGPPGPQGQFRPPGPQGQMGP 
QGPPLHQGGGGPQGFMGPQGPQGPPQGLPRPQD 
MHGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQG 
HMGPQGPPGPQGfflGPQGPPGPQGHLGPQGPPGT 
QGMQGPPGPRGMQGPPHPHGIQGGPGSQGIQGP 
VSQGPLMGLNPKGMQGPPGPRENQGPAPQGMI 
MGHPPQEMRGPHPPGGLLGHGPQEMRGPQEIRG 
MQGPPPQGSMLGPPQELRGPPGSQSQQGPPQGSL 
GPPPQGGMQGPPGPQGQQNPARGPHPSQGPPFQ 
QQKTPLLGDGPRAPFNQEGQSTGPPPLIPGLGQQ 
GAQGRIPPLNPGQGPGPNKVS/ERGAPPKHEGRA 
PPRGRDGFPGPMKTLV 


3744 


A 


1571 


652 


" PLTGRKCPGWTHSGSRRSPRIAEEVPGi-PKKAJiA 
SROFSETADRLELLRRAVMAAARATTPADGEEP 
An^AT?AT & A AnTJn Q^HTTT SGLELVKOGAEARVFR 

APEAtl»AJ.»AAAISJilxOoJSJJ-»^VJ^^-l-' V rwv^^-<'TJ--f . * *v 

grfqgr!\avikhrfpkgyrhpalearlgrrrtv 
oearallrcrragisapwffvdyasnclymeei 
egsvtvrdmfsplwrlkktpqglsnlaktigqvl 
armhdedlihgdlttsnmllkppleqlnivlidf 

GLSFISALPEDKGVDLYVLEKAFLSTHPNTETVFE 
AFLKSYSTSSKKARPVLKKLDE\nEU.RGKKRSMV 
G 


3745 


A 


127 


1433 


GSHRFSLASPLDPEVGPYCDTPTMRTLFNLLWLA 

LACSPVHTTLSKSDAKKAASKTLLEKSQFSDKPV 

QDRGLWTDLKAESWLEHRSYCSAKARDRHFA 

GPVLGYVTPWNSHGYDVTKVFGSKFTQISPVWL 

QLKRRGREMFEVTGLHDVDQGWMEIAVRKHAK 

GL\P*CLGSCLRTGLTM1SG/YVLDSEDEIEELSKT 

WQVAKNQHFDGFWEVWNQLLSQKRVGLIHM 

T THT APAT WOART T AT 1 VTPPATTPrtTnnT r^\AVT 

HKEFEQLAPVLDGFSLMTYDYSTAHQPGPNAPL 

SWVRACVQVLDPKSKWRSKILLGLNFYGMDYA 

TSKDAREPWGARYIQTLKDHRPRMVWDSQVSE 

HFFEYKKSRSGRHWFYPTLKSLQVRLELARELG 

VGVSIWELGQGLDYFYDLL*VGIAASAVDVFFSK 

PWSE 


3746 


A 


1 


898 


IDRAAECRTKPLPMAVSIRGNADSIVACLVLMVL 
YLIKJKJa.VACAAWYGFAVHMKrYPETYILPITL 

xjT T pF>Tirk>j'ni<rQT prvFRvnyrvAPt *1hT t i^pt r'xroT 
Jll-rljr^JL/JNl-^XNJjJvoJjiv^^rivi irl^A^Jj ilLfijJSJKX/UJNJtvl 

ALMFVAVAGLTFFALSFGFYYEYGWEFLEHTYF 

YHLTRRDIRHNFSPYFYMLYLTAESKWSFSLGIA 

AFLPQLILLSAVSFAYYRDLVFCWFLHTSIFVTFN 

KVCTSQYFLWYLCLLPLVMPLVRMPWKRAWL 

LMLWFIGQAMWLAPAYVLEFQGKNTFLFIWLA 

GLFFLLINCSILIQDSHYKEEPLTERIKYD 


3747 


A


1 


2325 


MVISFQGLVTFGDVAVDFSQEEWEWLNPIQRNL 

VRKWLLENVRNLASliGLGV^^ 

WTVKRKMTRAWCPDLKAVWKIKELPLKKDFCE 

GKLSQAVITERLTSYNLEYSLLGEHWDYDALFET 

QPGLVTIKNLAVDFRQQLHPAQKNFCKNGIWEN 

NSDLGSAGHCVAKPDLVSLLEQEKEPWMVKREL 

TGSLFSGQRSVHETQELFPKQDSYAEGVTDRTSN 

TKLDCSSFRENWDSDYVFGRKLAVGQETQFRQE 

PITHNKTLSKERERTYNKSGRWFYLDDSEEKVH 

NRDSlKNFQKSSVVIKQTGri^AGKKLFKaNlECKK 

TFTQSSSLTVHQRIHTGEKPYKCNECGKAFSDGS 

SFARHQRCHTGKKPYECmCGKAFIQNTSLIRHW 

RYYHTGEKPFDCIDCGKAFSDfflGLNQHRRIHTG 

EKPYKCDVCHKSF\RYGSSLTVHQRIHTGEKPYE 

CDVCRKAFSHHASL'nQ\HQRVHSGEKPFKCKEC 

GKAFRQNIHLASHLRIHTGEKPFECAECGKSFSIS 

SQLATHQRIHTGEKPYECKVCSKAFTQKAHLAQ 

HOl^TirrnFTf PVFrifFrnTfAl?^OTTT4T TOHnPVH 
n\^j\. 1 n X vjjDJnJT X C0\^r%Ji\^\jAJ\r o\^k x xii^iv^xiv^ix v n 

TGEKPYKCMECGKAFGDNSSCTQHQRLHTGQRP 

YECIECGKAFKTKSSLICHRRSHTGEKPYECSVC 

GKAFSHRQSLSVHQRIHSGKKPYECKECRKTFliQI 

GHLNQHKRVHTGERSYNYKkSRKVFRQTAHLA 

HHQRIHTGESSTCPSLPSTSNPVDLFPKFLWNPSS 

LPSP 


3748 


A 


823 


1 


GGYTKSGYDSACKDFVPHDLEVQIPGRVFLVTG 

GNSGIGKATALEIAKRGGTVHLVCRDQAPAEDA 

RGEIIRE\SGNQNIFLHIVDLSDPKKIWKFVENFKQ 

EHKLHVUVhmAGCMVNKREAHKKMDFEKNFG 

CQYSGVCTFLTTRPDPLCWRKNTDPRVmVSSG 

Gl^VQKLIWQ*SP\aiKNTIWMGTMVYAQ]^ 

ERQQVVLTfiRWGPRAPG^lHFSSMHPGWAXDTPU 
VRQAMPGFHVQASGYRLRSEAQGADTMLWLAL 

S S ARSRTAQRP 


3749 


A 


1939 


715 


GFLRLSQAT\RQRLSffVMVLlXDPTRD\QCFGDR 

FSRLLLDEFLGYDDE.\MSSVKGLAE>JEENKGFLR 

NVVSGEHYRFV\SMWMART^YLAAFANHGQSF 

TLSVSHACCGYSHHQIFVFIVDLLQMLEMNMAIA 

FPAAPLLTVILALVGMEAIMSEFFNDnTAFYIILI 

VWLADQYDAICCHTSTSKRHWLRFFYLYHFAFY 

AYHYRFNGQYSSLALVTSWLHQHSMIYFFHHYE 

LPAILQHVRIQ\EMLLQAPTLGPGTPTA\LPDDMN 

NNSGAPATAPVDSAGQPPALGPVSPGASGSPGPV 

AAAPSSLVAAAASVAAAAGGDLGWMAETAAUT 

DASFLSGLSASLLERKPASPLGPAGGLPHAPQDS 

VPPSDSAASDTIPLGAAVQGPSPASMAPTEAPSE 

VGS 


3750 


A 


2 


844 


GLUEPFSKLLSFVIQNAVFrLAYLVELCGLCYKA 

FIKERDKFYLSRSWLELLQALKLKSPLPDTNLL 

LLVQFICADAGTKLAESTILSKQNOASVPGCGTA 

AMECWQYINEVLDFMVADMHTL'IKLKSHMKTC 

SQPLHEDTFGGHLKVGLAQIAAMDISRGNHRDN 

KAVIRYLPWLYHPPSAMQQGPKEFIECVSHIRLL 

SWLLLGSLTHNAVCVLKWPPLPGLPPLDAGSHV 

APHLIVILIGFPEQSKTSVL\HMCSLFHAF\SLAQL 

WDSLLARQSGRW 


3751 


A


431 


2 


AFTRKCEETAFIVPQCEIlP'lWWVCRRIFrosSLtiK 
NPGVKEGCEFCPPKVEMFFKDDANHDPQWSRQ 
QLIAAKFGFAALGI/QTEVDIMSHAT* AVFEIPEKS 
RL\PQNCTPVDMKIEFGVHVTSKEILTDVIDNDS* 

RHSPS 


3752 


A 


131 


1278 


AWSGSGLLVLCINTASMPMISVLGKMl<LWyKliO 

PGGRWTCQTSRRVSSDPAWAVEWIELPRGLSLSS 

LGSARTLRGWSRSSRPSSVDSQDLPEVNVGDTV 

AMLPKSRRALTIQEIAALARSSLHGISQWKDHV 

TKPTAMAQGRVAHLffiWKGWSKPSDSPAALESA 

FSSYSDLSEGEQEARFAAGVAEQFAIAEAKLRA 

WSSVDGEDSTDDSYDEDFAGGMDTDMAGQLPL 

GPHLQDLFTGHRFSRPVRQGSVEPESDCSQTVSP 

DTLCSSLCSLEDGLLGSPARLAVPSCWAMSCFSPN 

CPPAGKVPSAAW/APLEAQDSLYNSPLTESCLSP 

AEBEPAPCKDCQPLCPPLTGSWERQRQASDLASS 

GWSLDEDEAEPEEQ 


3753


A


3


1138


YYSSVRQRVTC^RFRECAAALIEGSA IKV Y AU
EWRADRRSGFGVSQRSNGLRYEGEWLGNRRHG
YGRTrePDGSREEGKYKRNRLVHGGRVRSLLPL
ALRRGKVKEKVDRAVEGARRAVSAARQRQEIA
AARAADALLKAVAASSVAEKAVEAARMAKLIA
QDLQPMLEAPGRRPRQDSEGSDTEPLDEDSPGV
YENGLTPSEGSPELPSSPASSRQPWRPPACRSPLP
Tir^/^T\rfcr^-Di7CCPi? A WPPFWriGAGAOAEELAGYE
AEDEAGMQGPGPRDGSPLLGGCSDSSGSLREEE
GEDEEPLPPLRAPAGTEPEPIAMLVLRGSSSRGPD
AGCLTEELGEPAATERPAQPGAANPLWGAVAL

LDLSLAFLFSQLLT


3754


A


2


3338


SSLLEKMTSSDKDFRFMATSDLMSELQKDSIQLD

EDSERKVVKMLLRLLEDKNGEVQNLAVKWLGV 

PLGAFHASLLHCLLPQLSSPRLAVRKRAVGALGH 

LATACSTDLFVELADHLLDRLPGPRVPTSPTAIRT 

LIQCLGSVGRQAGHRLGAHLDRLVPLVEDFCNL 

DDDELRESCLQAFEAFLRKCPKEMGPHVPNVTS 

LCLQYIKHDPNYNYDSDEDEEQMETEDSEFSEQE 

SEDEYSDDDDMSWKVRRAAAKCIAALISSRPDL 

LPDFHCTLAPVLIRRFKEREENVKADVFTAYIVL 

LRQTRPPKGWLEAMEEPTQTGSNLHMLRGQVPL 

WKALQRQLKDRSVRARQGCFSLLTELAGVLPG 

SLAEHMPVLVSGIIFSLADRSSSSTIRMDALAFLQ 

GLLGTEPAEAFHPHLPILLPPVMACVADSFYKIA

AEALWLQELVRALWPLHRPRMLDPEPYVGEMS 

AVTLARLRATDLDQEVKERAISCMGHLVGHLGD 

RLGDDLEPTLLLLLDRLRNEITRLPAIKALTLVAV 

SPLQLDLQPILAEALHILASFLRKNQRALRLATLA

ALDALAQSQGLSLPPSAVQAVLAELPALVNESD 

MHVAQLAVDFLATVTQAQPASLVEVSGPVLSEL 

LRLLRSPLLPAGVLAAAEGFLQALVGTRPPCVDY 

AKLISLLTAPVYEQAVDGGPGLHKQVFHSLARC 

VAALSAACPQ\EAESTASRLVCDARSPHSSTGVK 

VLAFLSLAEVGQVAGPGHERELKAVLLEALGSPS 

EDVRAAASYALGRVGAGSLPDFLPFLLEQIEAEP 

RRQYLLLHSLKEALGAAQPDSLKPYAEDIWALL

FQRCEGAEEGTRGWAECIGKLVLVNPSFLLPRL 

RKQLAAGRPHTRSTVITAVKPLISDQPHPIDPLLK

SFIAVHNKPSLVRDLLDDILPLLYQETKIRRDLIRE
VEMGPFKHTVDDGLDVRKAAFECMYSLLESCLG 
QLDICEFLNHVEDGLKDHYDIRMLTFIMVARLAT 
LCPAPVLQRVDRLIEPLRATCTAKVKAGSVKQEF 
EKQDELKRSAMRAVAALLTIPEVGKSPIMADFSS 
QIRSNPELAALFESIQKDSTSAPSTDSMELS 


3755 


A 


2 


3338 


SSLLEKMTSSDKDFRFMATSDLMSELQKDSIQLD 

EDSERKWKMLLRLLEDKNGEVQNLAVKWLGV 

PLGAFHASLLHCLLPQLSSPRLAVRKRAVGALGH 

LATACSTDLFVELADHLLDRLPGPRVPTSPTAIRT 

LIQCLGSVGRQAGHRLGAHLDRLVPLVEDFCNL 

DDDELRESCLQAFEAFLRKCPKEMGPHVPNVTS 

LCLQYIKHDPNYNYDSDEDEEQMETEDSEFSEQE 

SEDEYSDDDDMSWKVRRAAAKCIAALISSRPDL 

LPDFHCTLAPVLIRRFKEREENVKADVFTAYIVL

LRQTRPPKGWLEAMEEPTQTGSNLHMLRGQVPL 

WKALQRQLKDRSVRARQGCFSLLTELAGVLPG 

SLAEHMPVLVSGIIFSLADRSSSSTIRMDALAFLQ 

GLLGTEPAEAFHPHLPILLPPVMACVADSFYKIA 

AEALWLQELVRALWPLHRPRMLDPEPYVGEMS 

AVTLARLRATDLDQEVKERAISCMGHLVGHLGD 

RLGDDLEPTLLLLLDRLRNEITRLPAEKALTLVAV 

SPLQLDLQPILAEALHILASFLRKNQRALRLATLA 

ALDALAQSQGLSLPPSAVQAVLAELPALVNESD 

MHVAQLAVDFLATVTQAQPASLVEVSGPVLSEL 

LRLLRSPLLPAGVLAAAEGFLQALVGTRPPCVDY 

AKLISLLTAPVYEQAVDGGPGLHKQVFHSLARC 

VAALSAACPQ\EAESTASRLVCDARSPHSSTGVK 

VLAFLSLAEVGQVAGPGHERELKAVLLEALGSPS 

EDVRAAASYALGRVGAGSLPDFLPFLLEQIEAEP

RRQYLLLHSLKEALGAAQPDSLKPYAEDIWALL 

FQRCEGAEEGTRGVVAECIGKLVLVNPSFLLPRL 

RKQLAAGRPHTRSTVITAVKFLISDQPHPIDPLLK 

SFIAVHNKPSLVRDLLDDILPLLYQETKIRRDLIRE 

VEMGPFKHTVDDGLDVRKAAFECMYSLLESCLG 

QLDICEFLNHVEDGLKDHYDIRMLTFIMVARLAT 

LCPAPVLQRVDRLIEPLRATCTAKVKAGSVKQEF 

EKQDELKRSAMRAVAALLTIPEVGKSPIMADFSS 

QIRSNPELAALFESIQKDSTSAPSTDSMELS


3756 


A 


112 


1361 


SLEEQQGRHPSFAPKCASQILGRIMITLITEQLQK 

QTLDELKCTRFSISLPLPDHADISNCGNSFQLVSE 

GASWRGLPHCSCAEFQ/DQPQLQLPSLRPEPAPQ 

TTWGNSPKEQPFSQVLRPEPPDPEKLPVPPAPPS 

KRHCRSLSVPVDLSRWQPVWRPAPSKLWTPIKH 

RGSGGGGGPQVPHQSPPKRVSSL/SVPPSSQCLFS 

MCPSSHTLQPSFLQPGPGP\DSSRPCAASPQSGSW 

ESDAESLSPCPPQRRFSLSPSLGPQASRFLPSARSS 

PASSPELPWRPRGLRNLPRSRSQPCDLDARKTGV 

KRRHEEDPRRLRPSLDFDKMNQKPYSGGLCLQE 

TAREGSSISPPWFMACSPPPLSASCSPTGGSSQVL 

SESEEEEEGAVRWGRQALSKRTLCQRDFGDLDL 

NLIEEN 


3757 


A 


413 


1 


PKPMLQQDFT/SLPDQGLDH1AE/NSYFDAKS>LCA 
AELVCKEWQQVTSE*MLWKKLIERMVHAYPLW 
KGLSEKVW/DQJDLFKNRPTDGPPNSFHRSLYPKn 
QV1ETIESNWQCG*UTLQRIQCHSEKSKGVYCLQ 
YDDEK 


3758


A


9


613


FVSGSPWRMDGSTERLEARRPAGRLPWSSRQEM 

TRRPSLMAGRQHGWSAQQSATVANPVPGANPD 

LLPHFLGEPEDVYIVKNKPVLLVCKAVPATQIFF 

KCNGEWVRQVDHVIERSTOGSSGLPTMEVRINV 

SRQQVEKVFGLEEYWCQCVAWSSSGTTKSQKA 

YIRIAYLRKOTEQEPLAKEVSLEQGIVLPCRPPEGI 

PPAE 


3759 


A 


1 


561 


ADDTLHLWNLRQKRPAILHSLKFCRERV IbCHl.l:' 

FQSKWLYVGTERGNIHIVNVESFTLSGYVIMWN 

KAIELSSKSHPGPWfflSDNPMDEGKLLIGFESGT 

VVLWDLKSKKADYRYTYDEAfflSVAWHHEGKQ 

FICSHSDGTLTIWNVRSPAKPVQTITPHGKQLICD 

GKKPEPCKPDLKVEFXTTR 


3760 


A 


1 


824 


LPACRCGCVAGCPSNHGICRCLRASERQVCVMH 

LKHLRTLLSPQDGAAKVTCMAWSQNNAKFAVC 

TVDRWLLYDEHGERRDKFSTKPADMKYGRKS

YMVKGMAFSPDSTKIAIGQTDNIIYVYKIGEDWG 

DKKVICNKFIQTVKFRPVPGTLG*TNIYQYIYL*IQ 

PGVAFLTSECDFSYCKDGASWLFMV1CCLP*SPA 

VSFPIGD*\SAVTCLQWPAEYnVFGLAEGKVRLS 

NTKTNKSSHYGTESYVVSLTTNCSGKGILSGHA 

DGYQR 


3761 


A 


2253 


320 


PVIQRCSQPYGFSLLISFFLKCVSETSQQPPSKKVh 
QLLPSFPTLTRSKSHESQLGNRIDDVSSMRFDLSH 
GSPQMVRRDIGLSVTHRFSTKSWLSQVCHVCQK 
SMIFGVKCKHCRLKCHNKCTKEAPACRISFLPLT 
RLRRTESVPSDINNPVDRAAEPHFGTLPKALTKK 

EHPPAMNHLDSSSNPSSTTFSTPSSPAPFPTSSNPS 

SATTPP\NPSP\GQR\DSRFNFPSC/AYFIHHR\Q\QFI 

FPDISAFAHAAPLPEAADGTRLDDQPKADVLEAH 

EAEAEEPEAGKSEAEDDEDEVDDLPSSRRPWRG 

PISRKASQTSVYLQEWDIPFEQVELGEPIGQGRW 

GRVHRGRWHGEVAIRLLEMDGHNQDHLKLFKK 

EVMNYRQTRHENWLFMGACMNPPHLAnXSFC 

KGRTLHSFVRDPKTSLDINKTRQIAQEIIKGMGYL 

GWPMEGRRENQLKLSHDWLCYLAPEJVRENCTPG 

KDEDQLPFSKAADVYAFGTVWYELQARDWPLK 

NQAAEASIWQIGSGEGMKRVLTSVSLGKEVSEN 

LSACWAFDLQERPSXFSLLMDMLEKLPKLNRRLS 

HPGHF*KSADINSSKVVPRFERFGLGVLESSNPK 

M 


3762 


A 


2 


1578 


MAHYITFLCMVLVLLLQNSVLAEDGEVRSSCRT 

APTDLVFILDGSYSVGPENFEIVKKWLVNITKNF 

DIGPKFIQVGVVQYSDYPVLEIPLGSYDSGEHLTA 

AVESILYLGGNTKTGKAIQFALDYLFAKSSRFLT 

KIAWLTDGKSQDDVKDAAQAARDSKTTLFAIG 

VGSETEDAELRAIANKPSSTYVFYVEDYIAISKIR 

EVMKQKLCEESVCPTRLPVAARDERGFDILLGLD 

VNKKVKKmQLSPKKlKGYEVTSKVDLSELTSNV 

FPEGLPPSYWVSTQPIFKVKKIWDLWRILTIDG/* 

POTAVTT TsinvnkTf T FTTTWrMrr^^nWTPA'MPnV 

x\^ij\ V i JUIN^J V J^JVLLjJUr JL 1 1 kj V ilN>Ji3V^ V V 1 TAiNx V 

KTLFDEGWHQIRLLVTEQDVTLYIDDQQIENKPL
%PVLGmmGQTQiGkYSGKEEWQFDVQKLRIY i 
CDPEQNNRETACEIPGFCLNGPSDVGSTPAPCICP 
PGKPGLQGPKGDPGLPGNPGYPGQPGQDGKPVS 
TESLVISGISGITGYQGIAGTPGVPGSPGIQGARGL 
PGYKGEPGRDGDK 


3763 


A 


3 


1267 


CKVWRNPLNLFRGAEYNRYTWVTGREPLTYYD 

MNLSAQDHQTFFTCDSDHLRPADAIMQKAWRE 

RNPQARISAAHEALEINECATAYILLAEEEATTIA 

EAEKLFKQALKAGDGCYRRSQQLQHHGSQYEA 

QHSVLYLPLQ\TRHQCLGVHQKKASNVCQKTRE 

DQGSSENDERFNEGVPPSEYVQYP*KPRKALLEL 

OAYADVOAVLAKYDDISLPKSATICYTAALLKA 

RAVSDKFSPEAASRRGLSTAEMNAVEAIHRAVEF 

NPHVPKYLLEMKSLILPPEHILKRGDSEAIAYAFF 

HLAHWKRVEGALNLLHCTWEGTFRMIPYPLEKG 

HLFYPYPICTETADRELLPSFHEVSVYPKKELPFFI 

LFTAGLCSFTAMLALLTHQFPELMGVFAKAVSV 

CLEGGLGEWMGKAKGDCAA 


3764 


A 


25 


1032 


RSADGLCGNKDRERGNEFTRNQQAAQEVVNPK 
KKMKKKKYVNSGTVTLLSFAVESECTFLDYIKG 
GTQINFTVABDFTASNGNPSQSTSLHYMSPYQLN 
AYALALTAVGEITOHYnSDKMFPALGFGAKJ ppr) 

GRVSHEFPLNGNQENPSCCGIDGILEAYHRSLRT 

VQLYGPTNFAPWTHVARNAAAVQDGSQYSVL 

LnTDGVISDMAQTBCEATVNGVSKLPMSmVGVGQ 

AEFNAMVELDGDDVRISSRGKLAERDIVQFVPFR 

DYVDRTGNHVLSMARLARDVLAEIPDQLVSYM 

KAQGIRPRSPPAAPTHSPSQSPARTPPACPLHTHI 


3765 


A 


172 


3456 


LGMMDSPKIGNGLPVIGPGTDIGISSLHMVGYLG 

KNFDSAKWSDEYCPAClOiKGKLKALKTYRISFQ 

ESIFLCEDLQCIYPLGSKSLNNLISPDLEECHTPHK 

PQKRKSLESSYKDSLLLANSKKTRNYIAIDGGKV 

LNSKHNGEVYDETSSNLPDSSGQQNPIRTADSLE 

RNEILEADTVDMATTKDPATVDVSGTGRPSPQN 

EGCHSKLEMPLESKCTSFPQALCVQWKNAYALC 

WLDOLSALVHSEEUCNTVTGLCSKEESIFWRLL 

TKYNQANTLLYTSQLSGVKDGDCKKLTSEIFAEI 

ETCLNEVRDEIFISLQPQLRCTLGDMESPVFAFPL 

LLKLETHIEKLFLYSFSWDFECSQCGHQYQNRH 

MKSLVTFTNVTPEWHPLNAAHFGPCNNCNSKSQI 

RKMVLEKVSPIFMLHFVEGLPQhIDLQHYAFHFE 

GCLYQITSVIQYRANNHFITWILDADGSWLECDD 

LKGPCSERHKKFEVPASEIHIVIWERKISQVTDKE 

AACLPLKKTNDQHALSNEKPVSLTSCSVGDAAS 

AETASVTHPKDISVAPRTLSQDTAVTHGDHLLSG 

PKGLVDNE.PLTLEETIQKTASVSQLNSEAFL\LEN 

KPVAENTGILKTNTLLSQESLMASSVSAPCNEKLI 

QDQFVDISFPSQWNTNMQSVQLNTEDTVNTKS 

VNNTDATGLIQGVKSVEIEKDAQLKQFLTPKTEQ 

LKPERVTSQVSNLKKKETTADSQTTTSKSLQNQS 

LKENQKKPFVGSWVKGLISRGASFMPLCVSAHN 

RNTITDLQPSVKGVNNFGGFKTKGINQKASHVSK 

KARKSASKPPPISKPPAGPPSSNGTAAHPHAHAA 

SEVLEKSGSTSCGAQL>fHSSYGNGISSANHEDLV 

EGQIHKLRLKLRKKLKAEKKKLAALMSSPQSRT 

VRSENLEQVPQDGSPNDCESIEDLLNELPYPIDIA 

NESACTTVPGVSLYSSQTHEEILAELLSPTPVSTE 

LSENGEGDFRYLGMGDSfflPPPVPSEFNDVSQNT 

HLRQDHNYCSPTKKNPCEVQPDSLTNNACVRTL 

NLESPMKTOIFDEFFSSSALNALANDTLDLPHFDE 

YLFENY 


3766 


A 


3 


1622 


AQQnrmr^TVILENYKNLVSLGYQLTKPDmRL 

EKGEEPWLVEREIHQETHPDSETAFEIKSSVSSRSI 

FKDKQSCDIKMEGMARNDLWYLSLEEVWKCRD

QLDKYQENPERHLRQVAFTQKKVLTQERVSESG 

KYGGNCLLPAQLVLREYFHKRDSHTKSLKHDLV 

LNGHQDSCASNSNECGQTFCQNIHLIQFARTHTG 

DKSYKCPDNDNSLTHGSSLGISKGIHREKPYECK 

ECGKFFSWRSNLTRHQLIHTGEKPYECKECGKSF 

SRSSHLIGHQKTHTGEEPYECKECGKSFSWFSHL 

VTHQRTOTGDKLYTCNQCGKSFAOISSRLIRHQR 

THTGEKPYECPECGKSFRQSTHLILHQRTHVRVR

PYECNECGKSYSQRSHLWHHRIHTGLKPFECKD 

CGKCFSRSSHLYSHQRTHTGEKPYECHDCGKSFS 

QSSALIVHQRIHTGEKPYECCQCGKAFIRKNDLIK 

HQRIHVGEETYKCNQCGIIFSQNSPFIVHQIAHTG

EQFLTCNQCGTALVNTSNLIGYQTNHIRENAY


3767 


A 


3 


1622 


AQQIVYRNVMLENYKKLVSLGYQLTKPUVILKL 

EKGEEPWLVEREIHQETHPDSETAFEIKSSVSSRSI

FKDKQSCDKMEGMARNDLWYLSLEEVWKCRD 

QLDKYQENPERHLRQVAFTQKKVLTQERVSESG 

KYGGNCLLPAQLVLREYFHKRDSHTKSLKHDLV 

LNGHQDSCASNSNECGQTFCQNIHLIQFARTHTG 

DKSYKCPDNDNSLTHGSSLGISKGIHREKPYECK 

ECGKFFSWRSNLTRHQLIHTGEKPYECKECGKSF 

SRSSHLIGHQKTHTGEEPYECKECGKSFSWFSHL 

VTHQRTHTGDKLYTCNQCGKSF/VHSSRLIRHQR 

THTGEKPYECPECGKSFRQSTHLILHQRTHVRVR 

FYECNECGKSYSQRSHLVVHHRIHTGLKPFECKD 

CGKCFSRSSHLYSHQRTHTGEKPYECHDCGKSFS 

QSSALIVHQRIHTGEKPYECCQCGKAFIRKNDLIK 

HQRIHVGEETYKCNQCGIIFSQNSPFIVHQIAHTG

EQFLTCNQCGTALVNTSNLIGYQTNHIRENAY 


3768 


A 


185 


2258 


SinKMSRKISKESKKVNISSSLESEDISLETTVPTD 

DISSSEEREGKVRTTRQLIERKELLHNIQLLKIELS 

QKTMMTONLKXHDYLTKIEELEEKLNDALHQKQL 

LTLRLDNQLAFQQKDASKYQELMKQEMETILLR 

QKQLEETNLQLREKAGDVRRSLRDFELTEEQYIK 

LKAFPEDQLSIPEYVSVRFYELVNPLRKEICELQV 

KKNIIjyEELSTNKNQLKQLTETYEEDRKNYSEV 

QIRCQRLALELADTKQLIQQGDYRQENYDKVKS 

ERDALEQEVBELRRKHEILEASHMIQTKERSELSK 

EVVTLEQTVTLLQKDKEYLNRQNMELSVRCAHE 

EDRLERLQAQLEESKKAREEMYEKYVASRDHY 

KTEYENKLHDELEQIRLKTNQEIDQLRNASREMY 

ERENRNLREARDNAVAEKERAVMAEKDALEKH 

DQLLDRYRE\LQ\LSTESKVTEFLHQSKLKSFESE 

RVQLLQEETARNLTQCQLECEKYQKKLEVLTKE 

FYSLQASSEKRITELQAQNSEHQARLDIYEKLEK 

ELDEIIMQTAEIENEDEAERVLFSYGYGANVPTT 

AKRRLKQSVHLiVsRRVLQLEKQNSLI/LKRSGTSK 

GPSNTAFTOSLTEANSLLNQTQQPYRYLEESVRQ 

RDSKIDSLTESIAQL/ERKDVSNLNKEKSALLQTN 

GIKMAL\DL\DQLLNHP 


3769 


A 


3 


2297 


DAAEFRVVADAMKVIGFKPEEIQTVYKILAAILH 

LGNLKFWDGDTPLIENGKVVSUAELLSTKTDM 

VEKALLYRTVATGRDIIDKQHTEQEASYGRDAF 

AKAIYERLFCWIVTRI>nDimVK>mDT^ 

IGVLDIYGFEIFDIWSFEQFCINYCNEKLQQLFIQL 

\a.KQEQEEYQREGIPWKHIDYFNNQIIVDLVEQQ 

HKGIIAILDDACMNVGKVTDEMFLEALNSKLGK 

HAHFSSRKLCASDKILEFDRDFRIRHYAGDVVYS 

VIGFIDKNKDTLFQDFKRLMYNSSNPVLKNMWP 

EGKLSITEVTKRPLTAATLFKNSMIALVDNLASK 

EPYYVRCIKPNDKKSPQIFDDERCRHQVEYLGLL 

ENVRVRRAGFAFRQTYEKFLHRYKMISEFTWPN 

HDLPSDKEAVKKLffiRCGFQDDVAYGKTKIFIRT 

PRTLFTLEELRAQMLIRIVLFLQKVWRGTLARMR 

YKRTKAALTIIRYYRRYKVKSYIHEVARRFHGVK 

TMRDYGKHVKWPSPPKVLRRFEEALQTBFNRWR 

ASQLIKSIPASDLPQVRAKVAAVEMLKGQRADL 

GLQRAWEGNYLASKPDTPQTSGTFVPVANELKR 

KDKYMNVLFSCHVRKVNRFSKVEDRAIFVTDRH 

LYKMDPTKQYKVMKTIPLYNLTGLSVSNGKDQL 

VVFHTKDNKDLIVCLFSKQPTHESRIGELWGVLV 

NHFKSEKRHLQWNVTNPVQCSLHGKKCTVSVE 

TRLNQPQPDFTKNRSGFILSVPGN 


3770 


A 


3 


6276


HKVAAPDWVPTLDTVRHEALLYTWLAEHKPL 
VLCGPPGSGKTMTLFSALRALPDMEWGLNFSS 
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PCTAJSOl/04098 



I SEQID 
NO: 


[Method 1 i 


Predicted 

beginning 

aucleotide 

ocation 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine C=Cysteine, D=Aspartic Acid, 

r, M. A A J 1?— i>ii»nviaiantn# C^^lvclnc H— Histiiliilfc> 

b^Gluukmic Ada* r=rnenyi«iaiiine, v»— wijwui^ n uiwwauvt 

l=Isolcudne, K=Lysine, l^Leudne, M=Metliloiilne, 

N=Asparagin«, P=Prolint, Q=Glatainide, R»Argimiic S=SerUie, 

T=Threoiiine, V=VaUoe, W=Tryptophan, Y^^Iirosine, 

X=UdIuiowd, *=Stop codon, /=possible uDdeoiide ddetioB, 

^possible nodeotide insertion 










ATIPELLLKTFDHYCEYRRTPNGVVLAPVQLGK 

WLVLFCDEINU'DMDKYGTQRVISFIRQMVEHG 

GFYRTSDQTWVKLERIQFVQACNPPTDPGRKPLS 

HRFLRHVPVVYVDYPGPASLTQIYGTFNRAMLR 

LIPSLRTYAEPLTAAMVEFYTMSQERFTQDTQPH 

YIYSPREMTRWVRGIFEALRPLETLPVEGLIRIWA 

HEALRLFQDRLVEDEERRWTDENIDTVALKHFP 

N1DREKAMSRPD.YSNWLSKDYIPVDQEELRDYV 

KARLKVFYEEELDVPLVLFNEVLDHVLRIDRIFR 

QPQGHLLLIGVSGAGKTTLSRFVAWMNGLSVYQ 

IKVHRKYTGEDFDEDLRTVLRRSGCKNEKIAFIM 

DESNVLDSGFLERMNTLLANGEVPGLFEGDEYA 

TLMTQCKEGAQKEGLNDLDSHEELYKWFTSQVIR 

NLHVVrrMNPSSEGLKDRAATSPALFNRCVLNW 

FGDWSTEALYQVGKEFTSKMDLEKPNYTVPDYM 

PWYDKLPQPPSHREATVNSCVFVHQTLHQANA 

RLAKRGGRTMAITPRHYLDFINHYANLFHEKRSE 

LEEQQMHLIWGUUKIKETVDQVEELRRDLRIKS 

QELEVKNAAAhTOKLKKMVKDQQEAEKKKVMS 

QEIQEQLHKQQEVIADKQMSVKEDIJDKVEPAVI 

EAQNAVKSIKKQHLVEVRSMANPPAAVKLALES 

ICLLLGESTTDWQmsnMRENFIPTIVNFSAEEIS 

DAIREBMCKNYMSNPSYNYEIVNRASLACGPMV 

KWAIAQLNYADMLKRVEPLRNELQKLEDDAKD 

NQQKANEVEQMIRDLEASIARYKEEYAVLISEAQ 

AIKADLAAVEAKVNRSTALLKSLSAERERWEKT 

SETFKNQMSTIAGDCLLSAAFIAYAGYFDQQMR 

QNLFTTWSHHLQQANIQFRTDIARTEYLSNADER 

LRWQASSLPADDLCTENAIMLKRFNRYPLUDPS 

GQATEFIMNEYKDRKITRTSFLDDAFRKNLESAL 

RFGNPLLVQDVESYDPVLNPVLNREVRRTGGRV 

LITLGDQDEDLSPSFVIFLSTRDPTVEFPPDLCSRV 

TFVNFTVTRSSLQSQCLNEVLKAERPDVDEKRSD 

LLKLQGEFQLRLRQLEKSLLQALNEVKGRILDDD 

TIITTLENLKREAAEVTRKVEETDIVMQEVETVS 

QQYLPLSTACSSIYFTMESLKQIHFLYQYSLQFFL 

DIYHNVLYENPNLKGVTDHTQRLSnTKDLFQVA 

FNRVARGMLHQDHTTFAMLLARIKLKGTVGEPT 

YDAEFQHFLRGNEIVLSAGSTPRIQGLTVEQAEA 

VVRLSCLPAFKDLIAKVQADEQFGIWLDSSSPEQ 

TVPYLWSEETPATPIGQAIHRLLLIQAFRPDRLLA 

MAHMFVSTNLGESFMSIMEQPLDLTQIVGTEVKP 

NTPVLMCSVPGYDASGHVEDLAAEQNTQITSIAI 

GSAEGFNQADKAINTAVKSGRWVMLKNVHLAP 

GWLMQLEKKLHSLQPHACFRLELTMEINPKVPV 

NLLRAGRIFVFEPPPGVKANMIJRTFSSIPVSRICK. 

SPNERARLYFLLAWFHAHQmiLRYAPLGWSKKY 

EFGESDLRSACDTVDTWLDDTAKGRQNISPDKIP 

WSALKTLMAQSIYGGRVDNEFDQRLLNTFLERL 

T-wmowTNcm^vT A r<v\rnr^'PT^"nTnMPnGTRREEFV 
FTTRSFDSErKX*Auis.VJJoiijSsiJiv^^ ^ 

QWVELLPDTQTPSWLGLPNNAERVLLTTQGVD 

MISKMLKMQMLBDEDDLAYAETEKKTRTDSTS 

DGRP\AWMRTLHTTASNWLHLIPQTLSHLKRTVE 

NIKDPLFRFFE\REVKMGAKLLQ\DVRQDLADV\V 

OVCEGKKKOTOYLRTLKNELVVKGILPVRSWSHY 

TWAG\MTVIQWGVPISARRI\KQLQNISL\AAASG 

GAKELKNIHVCLGGLFVPEAYITATRQYVAQAN 

SWSLEELCLEVNVTTSQGATLDACSFGVTGLKL 

QGATCNNNKLSLSNAISTALPLTQLRWVKQTNT 

EKKASWTLPVYLlOTRADLIFTVDFEUtKEDPR 

SFYERGVAVLCTE 


3771 


A 



1 


2043 


LPLLHAGFMlRFMENSSnACYNELIQffiHGEVRS 

QFKLRACNSVFTALDHCHEAIEITSDDHVIQYVN 

PAFERMMGYHKGELLGKELADLPKSDKNRADL 

LDTINTCIKKGKEWQGVYYARRKSGDSIQQHVKI 

TPVIGQGGKIRHFVSLKKLCCTTDNNKQIHKIHR 

DSGDNSQTEPHSFRYKNRRKESIDVKSISSRGSDA 

PSLQNRRYPSMARIHSMTIEAPITKVINIINAAQEN 

SPVTVAEALDRVLEILRTTELYSPQLGTKDEDPH 

TSDLVGGLMTDGLRRLSGNEYVFTKNVHQSHSH 

LAMPITINDWPCISQlXD>ffiESWDFNIFEIJEAira 

KRPLVYLGLKVFSRFGVCEFLNCSETTLRAWFQ 

VIEANYHSSNAYHNSTHAADVLHATAFFLGKER 

VKGSLDQLDEVAALIAATVHDVDHPGRTNSFUC 

NAGSELAVLYND-AAVXLESHHTALAFQVLTVKDT 

KNCNIFKNID/RGNHYRTLRQAimMVLATEMTKH 

FEHVNKFVNSINKPMAAEIEGSDCECNPAGKNFP 

ENQILIKilMMIKCADVANPCRPLDLCrEWAGRIS 

EEYFAQTDEEKRQGLPVVMPVFDRNTCSIPKSQI 

SFIDYFITDMFDAWDAFAHLPALMQHLADNYKH 

WKTLDDLKCKSLRLFSDRLKPSHRGGLLTDKGH 


3772 


A 


1013 


50 


TLVHADGFPSLHITETCLAYREKRIGIDLVHDTVE 

HELIKEAEIIQGIMALLTOTLEEASEQIRMNRSAK 

YNLEKDUCDKFVALTIDDICFSLNNNSPNIRYSEN 

AVRIEPNSVSLEDWLDFSSTNVEKADKQKNNSL 

MLKALVDXRDLSQTANYLRKQCDWHTAFKNGL 

KDTKDARDQLADHLAKWMEEIASQEKNITALEK 

AILDQEGPAKVAHTRLETRTHRPNVELCRDVAQ 

YRLMKEVQEITHNVARLKETLA\QAQAELKGLH 

RRQLALQEEIQVKENTIYIDEVLCMQMRKSrPLR 

DGEDHGVWAGGLRPDAVC 


3773 


A 


1 


955 


AAARESERQLRLRLCVLNEILGTERDYVGTLRFL 

QSAFLHRIRQNVADSVEKGLTEENVKVLFSNIEDI 

LEVHKDFLAALEYCLHPEPQSQHELGNVFLKFK 

DKFCVYEEYCSNHEKALRLLVELNKIPTVRAFLL 

SCMLLGGRKTTDIPLEGYL\LSPIQRICKYPLLLKE 

LAKRTPGKHPDHPAVQ\SALQAMKTVCSNINETK 

RQMEKLEALEAAA/QSHffiGWEGSNLTDICTQLL 

LQGTLLKISAGNIQERAFFLFDNLLVYCKRKSRV 

TGSKKSTKRTKSINGSLYIFRGRINTEVMEVENVE 

DGTGSPSPSLA 


3774 


A 


4254 


2061 


ELQGDFSVPDVPKSMAWCENSICVGFKRDYYLI 

RVDGKGSDCELFPTGKQLEPLVAPLADGKVAVG 

QDDLTVVLNEEGICTQKCALNWTDIPVAMEHQP 

PYIIAVLPRYVEIRTFEPRLLVQSIELQRPRFITSGG 

SNIIYVASNHFVWRLIPVPMATQrQQLLQDKQFE 

LALQLAEMKDDSDSEKQQQIHHIKNLYAFNLFC 

QKRFDESMQVFAKLGTDPTHVMGLYPDLLPTDY 

RKQLQYPNPLPVLSGAELEKAHLALIDYLTQKRS 

QLVKKIM)SDHQSSTSPlJV[EGTPmSKK^ 

DTTLLKCYUnWALVAPLLRLENNHCHIEESEH 

VLKKAHKYSELIILYEKKGLHEKALQVLVDQSK 

KANSPLKGHERTVQYLQHLGTENLHLIFSYSVW 

VLRDFPEDGLKIFTEDLPEVESLPRDRVLGFLIEN 

FKGLAIPYLEHnHVWEETGSRFHNCLIQLYCEKV 

QGLMKEYLLSFPAGKTPVPAGEEEGELGEYRQK 

LLMFLEISSYYDPGRLICDFPFDGLLEERALLLGR 

MGKHEQALFIYVHILKDTRMAEEYCHKHYDRN 

KDGNKDVYLSLLRMYLSPPSIHCLGPIKLELLEPK 

ANLQAALQVLELHHSKLDTTKALNLLPANTQIN 

DnUFLEKVLEENAQKKRFNQVLKNLLHAEFLRV\ 

QEERILHQQVKCIITEEKVCMVCKKKIGNSAFAR 

YPNGWVHYFCS\KEVNPADT 


3775 


A 


1832 


839 


MSRARGALCRACLALAAALAALLLLPLPLPRAP 

APARTPAPAPRAPPSRPAAPSLRPDDVFIAVKTTR 

KNHGPRLRLLLRTWXISRARQQTFIFTDGDDPELE 

LQGGDRVINTNCSAVRTRQALCCKMSVEYDKFI 

ESGRKWFCHVDDDNYVNARSLLHLLSSFSPSQD 

VYLGRPSLDHPmATERVQGGRTVTTVKFWFAT 

GGAGFCLSRGLALKMSPWASLGSFMSTAEQVRL 

PDDCTVGYIVEGLLGARLLHSPLFHSHLENLQRL 

PPDTLLQQVTLSHGGPENPQNVVNVAGGFSLHQ 

DPTOFKSIHCLLYPDTDWCPROKQGAPTSR 


3776 


A 


3 


796 



"PRAKLGTRARNMAGQDAGCGRGGDDYSEDEGD 
SSVSRAAVEVFGKLKDLNCPFLEGLYITEPKTIQE 
LLGSPSEYRLEILEWMCTRVWPSLQDRFSSLKGV 
riEVKlQEMTKlGHELMLCAPDDQELLKGCACA 
QKQLHFMDQLLDTIRSLTIGCSSCSSLMEHFEDT 
REKNEALLGELFSSPHLQMLLNPECDPWPLDMQ 
PLLNKQSDDWQWASASAKSEEEEKLAELARQLQ 
ESAAKLHALRTEYFAQHEQGAAAGAAVTSAP 


3777


A 


3 


413 


SEEDVIEGKTAVIEKRRKKRSSAGVVED/IGGEVQ 
NMLEGVGVDINKALLAKRKRLEMYTKASLRTSN 
QKIEHVWKTQQDQRQKLNQEYSQQFLTLFQQW 
DLDMQKAEEQEEKILVGIMIRFIINQVSSRNGQPS 

LLL 


3778


A 


132 


788 


SRLPPPPPHLADGRAGARVPRSARLSRAVWVgu 

WTHGPIVRPPAAARTMWVNPEEVLLANALWITE 

RANPYFILQRRKGHAGDGGGGGGLAGLLVGTLD 

WLDSSARVAPYRILYQTPDSLVYWTIACGNGSR 

mXEHWEWLEQNLLQTLSIFENENDITTFVRGKI 

QGILVEYNKINDVKEDDDTEKFKEAIVKFHRLFG 

MPEEEKLVNYYSCSYWKG 


3779 


A 


2 


934 


CKSCTLFPQNPNLPPPSTRERPPGCKTVFVGGLPE 

NATEEIIQEVFEQCGDITAIRKSKKNFCHIRFAEEF 

MVDKAIYLSGYRMRLGSSTDKKDSGRLHVDFA 

QARDDFYEWECKQRMRAREERHRRKLEEDRLR 

PPSPPAIMHYSEHEAALLAEKLKDDSKFSEAM\Q 

T7T T c\T7TCT>nt:vNrpp\QA>jnFYSMVOSANSHVRRL 

MNEKATHEQEMEEAKENFKNALTGILTQFEQIV 

AVFNASTRQKAWDHFSKAQRKMDIWAKVHSEE 

Lltt^AQSEQLMGIRRHEEMEMSDDENCDSPTKKM 

RVDESALGAP 


3780 


A 


1 


2535 


" AAQAEREELAAGRJMPGGGPQGAPAAAGGCiOVS 

HRAGSRDCLPPAACFRRKRLARRPGYMRSSTGP 

GIGFLSPAVGTLFRFPGGVSGEESHHSESRARQC 

GLDSRGLLVRSPVSKSAAAPTVTSVRGTSAHFGI 

QLRGGTRLPDRLSWPCGPGSAGWQQEFAAMDS 

SETLDASWEAACSDGARRVRAAGSLPSAELSSNS 

CSPGCGPEVPPTPPGSHSAFTSSFSFIRLSLGSAGE 

RGEAEGCPPSREAESHCQSPQEMGAKAASLDGP 

HEDPRCLSQPFSLLATRVSADLAQAARNSSRPER 

DMHSLPDMDPGSSSSLDPSLAGCGGDGSSGSGD 

AHSWDTLLRKWEPVLRDCLLRJJRRQMEVISLRL 

KLQKLQEDAVENDDYDKAETLQQRLEDLEQEKI 

SLHFQLPSRQPALSSFLGHLAAQVQAALRRGATQ 

QASGDDTHTPLRMEPRLLEPTAQDSLHVSITRRD 

WLLQEKQQLQKEEEALQARMFVLEAKDQQLRRE 

lEEQEQQLQWQGCDLTPLVGQLSLGQLQEVSKA 

LQDTLASAGQIPFHAEPPEHRSLQERIKSLNLSLK 

EITTKVCMSEKFCSTLRKKVNDIETQLPALLEAK 

MHAISGNHFWTAKDLTEEIRSLTSDREGLEGLLS 

KLLVLSSRNVKKLGSVKEDYNRLRREVEHQETA 

YETSVKENTMKYMETLKNKLCSCKCPLLGKVW 

EADLEACRLLIQCLQLQEARGSLSVEDERQMDD 

LEGAAPPIPPRLHSEDKRKTPLKESYILSAELGEK 

CEDIGKKLLYLEDQLHTAIHSHDEDLIQSLRRELQ 

MVKETLQAMILQLQPAKEAGEREAAASCMTAG 

VHEAQA 


3781 


A




995


GRRRAGPAHSARMYNMMETELKPPGPQQTSGG 

GGGNSTAAAAGGNQKNSPDRVKRPMNAFMVW 3 

SRGQRRmAQEWKMHNSm^ ' 

TEKRPProEAKRLRALHMKEHPDYKYRPRRKTK 

TLMKKDKYTLPGGLLAPGGNSMASGVGVGAGL 

GAGVNQRMDSYAHMNGWSNGSYSMMQDQLG 

YPQHPGLNAHGAAQMQPMHRYDVSALQYNSM 

TSSQTYMNG/SRPTYSMSYSQQGTPGMAPGS\MG 

SWKSEASSSPPWTSSSHSRAPCQAGDLRDMIS 

MYLPGAEVPEPAAPSRLHMSQHYQSGPVPGTAI 

NGTLPLSHM 


3782 


A 


1 


2649 


FRVPDSCPVVLHSFTQLDPDLPRPESSTQEIGEELI 

NGVIYSISLRKVQLHHGGNKGQRWLGYENESAL 

M.YETCKVRTVKAGTLEBXVEHLVPAFQGSDLS 

YVTZFLCTYRAFTTTQQVLDLLFKRYGRCDALTA 

SSRYGCELPYSDEDGGPQDQLKNAISSILGTWLD 

QYSEDFCQPPDFPCLKQLVAYVQL>MPGSDLER 

RAHLLLAQLEHSEPIEAEPEGEEDWALSPVPALK 

PTPELELALTPARAPSPVPAPAPEPEPAPTPAPGSE 

LEVAPAPAPELQQAPEPAVGLESAPAPALELEPA 

PEQDPAPSQTLELEPAPAPVPSLQPSWPSPWAEN 

GLSEEKPHLLVFPPDLVAEQFTLMDAELFKKWP 

YHCLGSIWSQRDKKGKEHLAPTIRATVTQFNSV 

ANCVITTCLGNRSTKAPDRARV^HWIEVAREC 

RIUCNFSSLYAILSALQSNSIHRLKKTWEDVSRDS 

FRIFQKLSEIFSDENNYSLSRELLIKEGTSKFATLE 

MNPKRAQKRPKETGnQGTVPYLGTFLTDLVML 

DTAMKDYLYGRLINFEKRRKEFEVIAQIKLLQSA 

CNNYSIAPDEQFGAWFRAVERLSETESYNLSCEL 

EPPSESASNTLRTKKNTAIVKRWSDRQAPSTELS 

TSGSSHSKSCDQLRCGPYLSSGDIADALSVHSAU 

SSSSDVEEINISFVPESPDGQEKKFWESASQSSPET 

SGISSASSSTSSSSASTTPVAATRTHKRSVSGLCNS 

SSALPLYNQQVGDCCnRVSLDVDNGNMYKSILV 

TSQDKAPAVIRKAMDKHNLEEEEPEDYELLQILS 

DDRKLKIPENANVFYAMNSTANYDFVLKKRTFT 

KGVKVKHGASSTLPRMKQKGLKIAKGIF 


3783 


A 


3 


869 


RSGQGKVYGLIGRfURFQQMDVLEGLNLLITISGK 

KNKLRVYYLSWLRNKILHNDPEVEKKQGWTrV 

GDMEGCGHYRWKYERIKFLVIALKSSVEVYAW 

APKPYHKFMAFKSFADLPHRPLLVDLTVEEGQR 

LKVIYGSSAGFHAVDVDSGNSYDIYIPVHIQSQIT 

PHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIK 

DWLQWGEMPTSVAYlCSNQIMGWGEKAffilRS 

VETGHLDGVFMHKRAQRLKFLCERNDKVFFASV 

RSGGSSOVYFMTLNRNCIMNW 


3784 


A 


1213 


457 


LSPRQVDGLAGLQKGLSLSLLYQFLMNUm-Ol Y 

GLAEAGGYLHTAEGTHSPARSAAAGAMAGVMG 

AYLGSPIYMVKTHLQAQAASEIAVGHQYKHQQ 

MFQALTEIGQKHGLVGLWRGALGGLPRVIVGSS 

TQLCTFSSTKDLLSQWEEFPPQSWKLALVAAMM 

SGIAWLAMAPFDVACTRLYNQPHRCTGQGP\LY 

RGELDALLQTARTEGIFGMYKGIGASYFRLGPHTI 

LSLFFWDOLRSLYYTDTK 


3785 


A 

A 




813 


RRRGRHSLCGGKMLAYCVQDATVVDVliKJ*XNi' 

SKHYVYIINVTWSDSTSQTIYRRY\SKFFDLQMQL 

LD\KFPI\ESGQKDPKQRIIPFLPGKILFRRSHIRDV 

AVKRLKPIDEYCRALVRLPPHISQCDEVFRFFEAR 

PEDVNPPKEQGPSPPDAVLPYGVNKGKQELKAG 

PNWPGRTHHVVNCVTQKCLFVFHFKFSSSGNKE 

SKSL 


3786 


A 


3785 


1632 


EFVGRAASTTVVTRIAWRMADAGIRRVVPSJJl-Y 

PLVLGFLRDNQLSEVANKFAKATGATQQDANAS 

SLLDIYSFWLNRSAKVPERKLQANGPVAKKAKK 

KASSSDSEDSSEEEEEVQGPPAKKAAVPAKRVGL 

PPGKAAAKASESSSSEESSDDDDEEDQKKQPVQ 

KGVKPQAKAGQAPPKKAKSSDSDSDSSSEDEPP 

KNQKPK1TP\VTVKAQTKAPPKPARA\APKIANGK 

AASSSSSSSSSSSSDDSEEEKAAATPKKTVPKKQV 

VAKAPVKAATTPTRKSSSSEDSSSDEEEEQKKPM 

KNKPGPYSSVPPPSAPPPKKSLGTQPPKKAVEKQ 

QPVESSEDSSDESDSSSEEEKKPPTKAVVSKATTK 

PPPAKKAABSSSDSSDSDSSEDDEAPSBCPAGTTK 

NSSNKPAVTTKSPAVKPAAAPKQPVGGGQKLLT 

RKADSSSSEEESSSSEEEKTKKMVATTKPKATAK 

AALSLPAKQAPQGSRDSSSDSDSSSSEEEEEKTSK 

SAVKKKPQKVAGGAAPSKPASAKKGKAESSNSS 

SSDDSSEEEEEKLKGKGSPRPQAPKANGTSALTA 

QNGKAAKNSEEEEEEKKKAAVVVSKSGSLKKR 

rJi-N-vTCA AvcAtjrpnAK'K'TKLOTPNTFPBCRKKGEK 

RASSPFRRVREEEDEVDSRVADNSFDAKRGAAGD 

WGERANQVLKFTKGKSFRHEKTKKKRGSYRGG 

SISVOVNSIKFDSE 


3787 


A 


3 


5078


IPEQ/RALSAEHl'SSLVPSLHITTLGQEyAlLSUAV 
PASPSTGTADFPSILTTLQPTENHASPSPVPEMPTL 



WO 01/57190

PCT/US01/04098



SEQ ID
NO:


Method


Predicted
beginning
nucleotide
location
corresponding
to first amino
acid residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion










PAEGSDGSPPATRDLLLSSKVPNLLSTSWTFPRW 
KKDSVTAILGKNEEANVTIPLQAFPRKEVLSLHT 
VNGFVSDFSTGSVSSPnTAPRTNPLPSGPPLPSILS 
IQATQTVFPSLLAFSSTKPEVYAAAVDHSGLPAS 
APKQVRASPSS^4DVYDSLTIGDMKKPATTDVFW 
SSLSAETGSLSTESnSGLQQQTNYDLNGHTISTTS 
WETHLAPTAPPNGLTSAADAIKSQDFKDTAGHS 
VTAEGFSIQDLVLGTSIEQPVQQSDMTMVGSHID 
LWPTSNNNHSRDFQTAEVAYYSPrmHSVSHPQ 
LQLPNQPAHPLLLTSPGPTSTGSLQEMLSDGTDT 
GSEISSDINSSPERNASTPFQNILGYHSAAESSISTS 
VFPRTSSRVLRASQHPKKWTADTVSSKVQPTAA 
AAVTLFLRKSSPPALSAALVAKGTSSSPLAVASG 
PAKSSSMTTLAK^sTVTNKAASGPKJlTPGA^^ 
PFIPTYMYARTGHTTSTHTA/IARKHGHCLWPVV 
YNLP/PP/GKPQAMHTOLPNPTNLEMPRASTPRPL 
TVTAALTSITASVKATRLPPLRAENTDAVLPAAS 
AAWTTGKMASNLECQMSSKLLVKTVLFLTQRR 
VQISESLKFSIAKGLTQALRKAFHQNDVSAHVDI 
LEYSHNVTVGYYATKGKLVYLPAWIEMLGVY 
GVSNVTADLKQHTPHLQSVAVLASPWNPQPAG 
YFQLKTVLQFVSQADNIQSCKFAQTMEQRLQKA 
FQDAERKVLNTKSNLTIQIVSTSNASQAVTLVYV 
VGNQSTFLNGTVASSLLSQLSAELVGFYLTYPPL 
TIAEPLEYPNLDISETTRDYWVITVLQGVDNSLV 
GLHNQSFARVMEQRLAQLFMMSQQQGRRFKRA 
^*TTLGSYWQA4VmQRyPGPKD*PAELT\^ - 
GKPLLGTAAAKIl^TTOSQRi^ 
PVVKNPPNl^MIAAVLAPIAVVTVmilTAVLCR 
KNKNDFBCPDTMINLPQRAKPVQGFDYAKQHLG 
QQGADEEVIPVTQETVVLPLPIRDAPQERDVAQD 
GSTIKTAKSTETRKSRSPSENGSVISNESGKPSSGR 
RSPQNVMAQQKVTKEEARKRNVPASDEEEGAV 
LFDNSSKVAAEPFDTSSGSVQLIAIKPTALPMVPP 
TSDRSQESSAVLNGEVNKALKQKSDIEHYRNKL 
RLKAKRKGYYDFPAVETSKGLTERKKMYEKAP 
KEMEHVLDPDSELCAPFTESKNRQQMKNSVYRS 
RQSLNSPSPGETEMDLLVTRERPRRGIRNSGYDT 
EPEUEETNIDRVPEPRGYSRSRQVKGHSETSTLSS 
QPSIDEVRQQMHMLLEEAFSLASAGHAGQSRHQ 
EAYGSAQHLPYSEWTSAPGTMTRPRAGVQWVP 
TYRPEMYQYSLPRPAYRFSQLPEMVMGSPPPPVP 
PRTGPVAVASLRRSTSDIGSKTRMAESTGPEPAO 
LHDSASFTQMSRGPVSVTQLDQSALNYSGNTVP 
AVFAIPAANRPGFTGYFIPTPPSSYRNQAWMSYA 
GENELPSQWADSVPLPGYDBAYPRSRYPQSSPSRL 
PRQYSQPANLHPSLEQAPAPSTAASQQSLAENDP 
SDAPLTNISTAALVKAIREEVAKLAKKQTDMFEF 
QV 


3788 


A 


2 


1737 


MKGLYTDAEMKSDNVKDKDAKISFLQKAIDVV 

VMVSGEPLLAKPARIVAGHEPERTNELLQnGKC 

CLNKLSSDDAVRRVLAGEKGEVKGRASLTSRSQ 

ELDNK>rVREEESRVHKNTEDRGDAEIKERSTSRD 

RKQKEELKEDRMPREBCDKDKEKAKENGGNRHR 

EGERERAKARARPDNERQKDRGNRERDRDSERK

KETERKSEGGKEKERLRDRDRKRDRDKGKDRDK 

RRVKNGEHSWDLDRENNREHDKPEKKSASSGE 

MSKKLSDGTFKDSKAETETEISTRASKSLTTXTS 

KRRSKNSVEGDSTSDAEGDAGPAGQDKSEVPET 

PEIPNELSSNIRJRIPRPGSAKPAPPRVKRQDSMEAL 

OMDRSGSGKTVSNVITESHNSDNEEDDQFWEA 

APQLSH^SEEEMVTAVELEEEBKHGGLVIQaLET 

KKDYEKLQQSPKPGEKERSLFESAWKKEKDIVS 

KEIEKLRTSIQTLCKSALPLGKIMDYIQEDVDAM 

ONELQM\YHSENRQHAEALQQEQRITDCAVEP\L 

KAELA\ELEQLIKD\Q\QDKICAVKANILKNEEKIQ 

KMVYSINLTSRR 


3789 


A 


1 


4369 


MRTLGTCLATLAGLLLTAAGETFSGGCLFDEPYS 

TCGYSQSEGDDFNWEQVNTLTKPTSDPWMPSGS 

FMLVNASGRPEGQRAHLLLPQLKENDTHCIDFH 

YFVSSKSNSPPGLLNVYVKVNNGPLGNPIWNISG 

DPTRTWNRAELAISTFWPNFYQVIFEVrrSGHQG 

YLAIDEVKVLGHPCTRTPHFLRIQNVEVNAGQFA 

TFOCSAIGRTVAGDRLWLQGIDVRDAPLKEIKVT 

SSRRFIASFNVVNTTKRDAGKYRCMI\RTEGGVGI 

SNYAELWVKEPPVPIAPPQLASVGATYLWIQLN 

ANSINGDGPIVAREVEYCTASGSWNDRQPVDSTS 

YKIGHLDPDTEYEISVLLTRPGEGGTGSPGPALRT 

RTKCADPMRGPRKLEWEVKSRQITIRWEPFGY 

imRCHSYNLTVHYCYQVGGQEQVREEVSWDT 

ENSHPQHTITNLSPYTNVSVKLE.MNPEGKKESQ 

ELIVQTDEDLPGAVPTESIQGSTFEEKIFLQWREP

TOlh'GviTLYEITYKAVSSFDPEIDLSNQSGRVSK
LGNETOFLFFGLYPGTTYSFtlRASTAKGFGPPAT 
NQFTIXISAPSMPAYELETPLNQTDNTVTVMLKP 
AHSRGAPVSVYQIWEEERPRRTKKTTEE.KCYP 
VPIHFQNASLLNSQYYFAAEFPADSLQAAQPFTIG 
DNKTYNGYWNTPLLPYKSYRIYFQAASRANGET 
KIDCVQVATXGAATPKPVPEPEKQTDHTVKIAG 
VIAGILLFVIIFLGVVLVMKKRKL\AKKRKETMSS 
TRQEIDLWIGELNGPRSYAEQGTKLATRAFSFMD 
THNLNGRSVSSPSSFTMKTNTLSTSVPNSYYPDE 
THIMASDTSSLVQSHTYKKREPADVPYQTGQLH 
PAIRVADLLQHITQMKCAEGYGFKEEYESFFEGQ 
SAPWDSAKKDENRMKNRYGNIIAYDHSRVRLQT 
lEGDTNSDYINGNYIDGYHRPNHYIATQGPMQET 
lYDFWRMVWHENTASIIMVTNLVEVGRVKCCK 
YWPDDTEIYKDKVTLIETELLAEYVIRTFAVEKR 
GVHEIREIRQFHFTGWPDHQVPYHATGLLGFVR 
OVKSKSPPSAGPLWHCSAGAGRTGCFIVIDIML 
DMAEREGWDIYNCVRELRSRRVNMVQTEEQY 
VFIHDAILEACLCGDTSVPASQVRSLYYDMNKLD 
PQTNSSQIKEEFRTLNMVTPILRVEDCSIALLPKN 
HEKNRCMDELPPDRCLPFLITIDGESSNYINAALM 
r»cvTrnP«!AFIVTOHPLPNTVKDFWRLVLDYHCTS 
VVMLNDVDPAQLCPQYWPENGVHRHGPIQVEF 
VSADLEEOnSRIFRIYNAARPQDGYRMVQQFQFL 
GWPMYEDTPVSKRSFLKLIRQVDKWQEEYNGG 
EGRTWHCLNGGGRSGTFCAISIVCEMLRHQRTV 
DVFHAVKTlJ»JNKPNMNa)LLDOYKFCYEVALE

YLNSG 


3790 


A 


261 


485 


EEQTPLHIASRLGKmVQLLLQHMAHPDAATTN 
GYTPLfflSAREGQVNDWASVLLGRQGAAHSFRLT 
KVRRMTS 


3791 


A 


1 



5874 


LPPVTMSGKYIMEEHDSYSDQVWSIDELPSKQG 

YYLQGhra.RCVAEVGSFEHNLTTDLLNHLVFVQ 

KVFMKEVNEVIQKVSGGEQPIPLWNEHDGTADG 

DKPKILLYSLNLQFKGIQVTATTPSMRAVRFETG 

LIELELSNRLQTKASPGSSSYLKLFGKCQVDLNL 

ALGQIVKHQVYEEAGSDFHQVAYFKTRIGLRNA 

LREEISGSSDREAVLITL2SIRPIVYAQPVAFDRAVL 

FWLNYK\AAYDNWNEQRMALHKDIHMATKEVV 

DMLPGIQQTSAQAFGTPFLQLTVNDLGICLPITNT 

AQSNHTGDLDTGSALVLTIESTLITACSSESLVSK 

GHFKNFCIRFADGFETSWDDWKPEIHGDLVMNA 

CWPDGTYEVCSRTTGQAAAESSSAGTWTLNVL 

WmCGmVHMDPMGKRLNALGNTLTTLTGEED 

IDDIADLNSVNIADLSDEDEVDmSPTIHTEATDY 

RRQAASASQPGELRGRKIMKRIVDIRELNEQAKV 

IDDLKKLGASEGTINQEIQRYQQLESVAVNDIRR 

DVRKKLRRSSMRAASLKDKWGLSYKPSYSRSKS 

ISASGRPPLKRMERASSRVGETEELPEIRVDAASP 

GPRVTFNIQDTFPEETELDLLSVTEEGPSHYSSNSE 

GSCSVFSSPKTPGGFSPGIPFQTEEGRRDDSLSSTS 

EDSEKDEKDEDHERERFYIYRKPSHTSRKKATGF 

AAVHQLFTERWPTIPVNRSLSGTATEKNIDFELD 

'IRV^EiDSGKCyLHPTTLLQEHDDISLRRSYDRSSR 

SLDQDSPSKJCKXFQTNYASTTHLMTGKKVPSSL 

QTKPSDLETTVFYIPGVDVKLHYNSKTLKTESPN 

ASRGSSLPRTLSKESKLYGMKDSATSPPSPPLPST 

VQSKTNTLLPPQPPPIPAAKGKGSGGVKTAKLYA 

WALQSLPEENmSPCLLDFLEKALETIPITPVER 

NYTAVSSQDEDMGHFEIPDPMEES\TTSLVS\SSTS 

AYSSFPVDWVYVRVQPSQIKFSCLPVSRVECML 

KLPSLDLVFSSNRGELETLGTTYPAETLSPGGNA 

TQSGTKTSASKTGIPGSSGLGSPLGRSRHSSSQSD 

LTSSSSSSSGLSFTACMSDFSLYVFHPYGAGKQIT 

AVSGLTPGSGGLGNVDEEPTSVTGRKDSLSINLE 

FVkVSLSRIRRSGGASFFESQSVSKSASK]VmrrLI 

NISAVCDIGSASFKYDMRRLSEILAFPRAWYRRSI 

ARRLFLGDQTINLPTSGPGTPDSIEGVSQHLSPESS 

RKAYCKTWEQPSQSASFTHMPQSPNVFNEHMTN 

STMSPGTVGQSLKSPASIRSRSVSDSSVPRRDSLS 

KTSTPFhnKLSNKAASQQGTPWETLVVFAINLKQL 

hrVQMNMSNVMGNTTWTTSGLKSQGRLSVGSNR 

DREISMSVGLGRSQLDSKGGWGGTIDVNALEM 

VAHISEHPNQQPSHKIQITMGSTEARVDYMGSSIL 

MGIFSNADLKLQDEWKVNLYNTLDSSITDKSEIF 

VHGDLKWDIFOVMISRSTTPDLIKIGMKLOEFFT 

QQFDTSKRALSTWGPVPYLPPKTMTSNLEKSSQE 

QLLDAAHHRHWPGVLKWSGCfflSLFQIPLPEDG 

MQFGGSMSLHGNHMTLACFHGPNFRSKSWALF 

HLEEPNUFWTEAQKIWEDGSSDHSTYIVQTLDF 

HLGHbHMVTKPCGALESPMATITKITRRRHENPP 

HGVASVKEWFNYVTATRNEELNLLRNVDANNT

3792 



3793 




ENSTTVKNSSLLSGFRGGSSYKHETETIFAimi 



364 



3794 A 



340 



421 



158 



3795 



OLDFKSIHVQEPQEPSLQDASLKPKVECSWTEF 
TDHICVTMDAELIMFLHDLVSAYLKEKEKAIFPP 
RILSTRPGQKSPIUHDDNSSDKDREDSITYTTVDW 
RDFMCNTWHLEPTLRLISWTGRKIDPVGVDYILQ 
KLGFHHARTTIPKWLQRGVMDPLDKVLSVLIKK 
LGTALQDEKEKKGKDKEEH 



ONGSTPLHHAASKNRHEIALMLLEGGANFDUKD 
HYEATAKHQATAKGNFKMIHILLYYKASTUQDT 
EGNTPPHLVCD\RVEEAKLLVSQGA/SIYffiNKEE 

JCTwn .OV AKGALGLVLKRMVEG 

DIVPNPKMAPLGDEAPTLEKVLTPELSEliliViilK 

DDIQFHHFSSEEALQKVKYFVAKEDPSSQEEAHT 

PEAPPPQPPSSERCLGEMKCTLVRGDSSPRQAEL 

KSGPASRPAL 



SYWGEDYTYm'EmiDPFHKAlKKNl'i^iWVva 
SKAVYKHREMCGLTSTGRKSHGLEKDRMFPHAI 
GGSCRAA*RRRKTLQFPCYH 



24 



592 



3796 A 



3797 



3798 



592 



1556 



GSS)SRVSGTTSNGETKPVYPVMEKJClibUUll.fi 
RGHWNNKMEFVLSVAGEIIGLGNVWRFPYLCYK 
NGGGAFFIPYLVFLFTCGIPVFLLETALGQYTSQG 
GVTAWRKICPIFEGIGYASQMIVILLNVYYnVLA 
WALFYLFSSFTIDLPWGGCYHEWNTEHCMEFQK 
TNGSLNGTSENATSPVIEFW 



KPASTYSTSQPSMAPLLPIRTLPLILlLLALLbJfGA 
ADFNISSLSGLLSPALTESLLVALPPCHLTGGNAT 
LMVRRANDSKVVTSSFVVPPCRGRRELVSWDS 
GAGFTVTRLSAYQVTNLVPGTKFYISYLVKKGT 
ATESSREIPMFTLPRRNMESIGLGMARTGGMVVI 
TVT .LSVAMFLLVLGFIIALALGSRK 



73 



759 



ATRLLRGSGSWGCSRLRFGPPAYRRFSSCiCiAYPN 
PLSSPLPGVPKPVFATVDGQEKFETKVTTLDNGL 
RVASQNKFGQFCTVGILrNSGSRYEAKYLSGIAH 
FLEKLAFSSTARFDSKDEILLTLEKHGGICDCQTS 
RDTTMYAVSADSKGLDTVVALLADVVLQPRLT 
DEEVENraRMAVQFELEDLNLRPDPEPLLTEMIHE 
AAYRENTVGLHRFCPTENVAKINREVLHSYLRN 
YYTPDRMVLAGVGVEHEHLVDCARKYLLGVQP 
AWGSAEAVDIDRSVAQYTGGIAKLERDMSNVSL 
GPTPIPELTHIMVGLESCSFLEEDFIPFAVLNMMM 
GGGGSFSAGGPGKGMFSRLYLNVLNRHHWMYN 
ATSYHHSYEDTGLLCIHASADPRQVREMVEIITK 
EFn.MGGTVDTVELERAKTQLTSMLM]vmLESRP 
VIFEDVGRQVLATRSRKLPHELCTLIRNVKPEDV 
KRVASKMLRGKPAVAALGDLTDLPTYEfflQTAL 

SSKPGRLP RTYRLFR 

KRLVEAGVPRTFDGIVGEGGAQSRSCWPWGVTA 



OTPAFSADSLNCLKNCMSITMGSVRPSVEQFHKY 
LPWFLNDRPNIKCPKGGLAAYSTSVNLTSDGQV 
LASRFMAYHKPLKNSQDYTEALRAARELAANIT 
ADLRKVPGTDPAFEVFPYTITNVFYEQYLTILPEG 
LFMLSLCLVPTFAVSCLLLGLDLRSGLLNLLSIV 
MILVDTVGFMALWGISYNAVSLINLVS 



3799 



73 



759 



KRLVEAGVP RTFDGIVGEGGAQSRSCWPWGVTA 
OTPAFSADSLNCLKNCMSnMGSVRPSVEQFHKY

LPWFLNDRPNIKCPKGGLAAYSTSVNLTSDGQV 

LASRFMAYHKPLKNSQDYTEALRAARELAANIT 

ADLRKVPGTDPAFEVFPYTITNVFYEQYLTILPEG 

LFMLSLCLVPTFAVSCLLLGLDLRSGLLNLLSIV 

MBLVDTVGFMALWGISYNAVSLINLVS 


3800 


A 


250 


1032 


GIFRSLRVLFPLFSVGRPQFARSLSAAPQLSDTAD 

TMGFGDLKSPAGLQVLNDYLADKSYEEGYVPSQ 

ADVAVFEAVSSPPPADLCHALRWYNHIKSYEKE 

KASLPGVKKALGKYGPADVEDTTGSGATDSKD 

DDDIDLFGSDDEEESEEAKRLREERLAQYESKKA 

KKPALVAKSSILLDVKPWDDETDMAKLEECVRS 

IQADGLVWGSSBCLVPVGYGIKKLQIQCVVEDDK 

VGTDMLEEQITAFEDYVQSMDVAAFNKI 


3801 


A 


155 


656 


SREMELVTFRDVAIEFSPEEWKCLDPAQQNLYR 

DVMLENYRNLVSLGFVISNPDLVTCLEQIKEPCN 

LKIHETAAKPPAICSPFSQDLSPVQGIEDSFHKLIL 

KRYEKCGHENLQLRKGCKRVNECKVQKGVNNG 

VYQCLSTTQSKIFQCNTCVRVFSTSSHSNKHK 


3802 


A 


1 


1428 


VTVSPETHMDLTKGCVTFEDIAIYFSQDEWGLLD 
EAQRLLYLEVMLENFALVASLGCGHGTEDEETP 
SDQNVSVGVSQSKAGSSTQKTQSCEMCVPVLKD 
ILHLADLPGQKPYLVGECTNHHQHQKHHSAKKS 
LKRDMDRASYVKCCLFCMSLKPFRKWEVGKDL 
PAMLRLLRSLVFPGGKKPGTITECGEDIRSQKSH 
YKSGECGKASRHKHTPVYHPRVYTGKKLYECSK 
CGKAFRGKYSLVQHQRVHTGEPa>WECNECGKF
FSQTSHLNDHiRRiHTG 

LVDHQKIHTGARPYECSQCGKSFSQKATLVKHQ 

RVHTGERPYKCGECGNSFSQSAILNQHRRIHTGA 

KPYECGQCGKSFSQKATLIKHQRVHTGERPYKC 

GDCGKSFSQSSELIQHRRIHTGARPYECGQCGKSF 

SQKSGLIQHQVVHTGERPYECNKCGNSFSQCSSL 

IHHQKCHNT 


3803 


A 


193 


617 


LFPFLGSESKNGEADSSDKEMKHGQKSPTGKQTS 

QHLKRLKKSGLGHLKWTKAEDIDIETPGSILVNT 

NLRALIhOaHTFASLPQHFQQYLLLLLPEVDRQMG 

SDGILRLSTSALNNEFFAYAAQGWKQRLAEGKF 

VFSIIM 


3804 


A 


197 


479 


SSSRASPPEHPSSQAHCGPLVLSHACPEVTNKWS 
TGSSSSPNSSWVSSPLQPEGLSGSSRMKGGSATKI 
LLETLLLAAHMTADQGIASSQRCLL 


3805 


A 


1 


385 


QSADTLFPGDINFNVSGLFSAVTLQDTVSDRLAS 
EELPSTAVPTPATTPAPAPAPAPATAPALVSAAT 
KBRTESEVPPRPASPKVTRSPPETAAPVEDMARR 

SELAVGGEEGTEGGRGEGTGSPMSSY 


3806 


A 


47 


1033 


LQGDTWHLSFLSHFSRLHGGVPGRGLLEGNLLQ 

PQAPGHDMTSIPFPGDRLLQVDGVILCGLTHKQA 

VQCLKGPGQVARLVLERRVPRSTQQCPSANDSM 

GDERTAVSLVTALPGRPSSCVSVTDGPKF*SSN* 

KRIANGLGFSFVQMEKESCSHLKSDLVRIKRLFP 

GHPAEENGAIAAGDULGREWEGPRKASSSRCRG 

SWAMQLSVQAGPSFASYYPAAVEVLHLLRGAPQ 

EVTLLLCRPPPGALPELEQEWQTPELSADKEFTR 

ATCTDSCTSPILGSRGQLGGTVPPQMQGKAWGL 

RPESSQKAIREGTMGAKTERDLGPVP 




3807 


A 


656 


1238 


RCPSLLPPSWPLFILQTLTKTPGNKAIAGQAULW 

AVLWGSERTPPYR*GN*NQRGAVPCLRPHRLRP 

QDKFLVLASDGLWDMLSNEDWRLWGHLAEA 

DWHKTDLAQRPANLGLMQSLLLQRKASGLHEA 

DQNAATRLIRHAIGNNEYGEMEAERLAAMLTLP 

RDLARMYRDDrrVTWYFNSESlGAYYKGG 


3808 


A 


26 


2195 


SQYSESVAGRQASPERLLGSYHAMASTVEGUDl 

ALLPEFPRGPLDAYRARASFSWKELAUTEGEG 

MLRFKKTIFSALENDPLFARSPGADLSLEKYREL 

NFLRCKRIFEYDFLSVEDMFKSPLKVPALIQCLG 

MYDSSLAAKYLLHSLVFGSAVYSSGSERHLTYIQ 

KIFRMEEFGCFALTELSHGSNTKAIRTTAHYDPAT 

EEFIfflSPDFEAAKFWVGNMGKTAlHAWFAKL 

CVPGDQCHGLHPFIVQIRDPK'n.LPMPGVMVGDl 

GKKLGQNGLDNGFAMFHKVRVPRQSLLNRMGD 

VTPEGTYVSPFKOVRQRFGASLGSLSSGRVSIVSL 

AILNLKLAVAIALRFSATRRQFGFTEEEEIPVLEY 

PMQQWRLLPYLAAVYALDHFSKSLFLDLVELQR 

GLASGDRSARQAELGREIHALASASKPLASWTT 

QQGIQECREACGGHGYLAMNRLGVLRDDNDPN 

CTYEGDNNILLQQTSNYLLGLLAHQVHDGACFR 

SPLKSVDFLDAYPGE.DQKFEVSSVADCLDSAVA 

LAAYKWLVCYLLRETYQKLNQEKRSGSSDFEAR 

NKCQVSHGRPLALAFVELTVVQRFHEHVHQPSV 

PPSLRAVLGRLSALYALWSLSRHAALLYRGGYF 

SGEQAGEVLESAVLALCSQLKDDAVALVDVIAP 

PDFVLDSPIGRADGELYKNLWGAVLQESKVLER 

ASWWPEFSVNKPVIGSLKSKL 


3809 


A 


117 


830 


CFGIMERVGCTLTTTYAHPRPTPTNFLPAIS 1 MAb 

SYRDRFPHSNLTHSLSLPWRPSTYYKVASNSPSV 

APYCTRSQRVSENTMLPFVSNRTTFFTRYTPDDW 

YRSNLTNYQESNTSRHNSEKLRVDTSRLIQDKYQ 

QTRKTQADTTQNLGBRVNDIGFWKSBIIHELDEM 

IGETNALTDVKKRLERALMETEAPLQVARECLF 

HREKRMGIDLVHDEVEAQLLTVNVGEMHQSQA 

A 


3810 


A 


3 


518 


VIQELEGGSGADLGEHSCRPASQPRFFRPAtAKb 
HPATRRPASGPAMGKTNSKLAPEVLEDLVQNTE 
FSEQELKQWYKGFLKDCPSGE.NLEEFQQLYIKF 
FPYGDASKFAQHAFRTFDKNGDGTIDFRHICAL 
SVTSRGSFEQKLNWAFEMYDLDGDGRTTRLEML 

EHE 


3811 


A 


81 


1147 


GCGYGCSGAGGAAlGEPMAKWGliUJJPRWIVEE
RADATNVNNWHWTERDASNWSTDKLKTLFLAV 
QYQNEEGKCEVTEVSKLDGEASINNRKGKLIFFY 
EWSVKLNWTGTSKSGVQYKGHVEIPNLSDENSV 
DEVEISVSLAKDEPDTNLVALMKEEGVKLLREA 
MGIYISTLKTEFTQGMILPTMNGESVDPVGQPAL 
KTEERKAKPAPSKTQARPVGVKIPTCKITLKETFL 
-reoTjtsT VPVBTTOPT VOAFTHAPATLEADRGGKF 

HMVDGNVSGEFTDLVPEBCHIVMKWRFKSWPEG
HFATITLTFIDKNGETELCMEGRGFFAPEEERTRQ
GWORYYFEGIKQTFGYGARLF


3812 


A 


20 


558 


PCGTAASTHAYDRRAKCRQQQQQQQNGGl^NKV
RPAKKKTSPAREVSSESGTSGQFIPPSSTSVPTIAS

SSAPVSIWSPASISPLSDPLSTSSSCMQRSYPMTYT 
QASGYSQGYAGSTSYFGGMDCGSYLTPMHHQL 
PGPGATLSPMGTNAVTSHLNQSPASLSTQGYGAS 
KLWGFNFNH 


3813 


A 


1 


1016 


CTEPPRRSTRTPAALASLRPYTDYVVVSDQILQES 

EDFFTLIESHEGKPLKLMVYNSKSDSCREVTVTP 

NAAWGGEGSLGCGIGYGYLHRIPTQPPSYHKKPP. 

SDYMEALLQAPGSSMEDPLPGPGSPSHSAPDPDG 
LPHFMETPLQPPPPVQRVMDPGFLDVSGISLLDN 

SNASVWPSLPSSTELTTTAVSTSGPEDICSSSSSHE 
RGGEATWSGSEFEVSFLDSPGAQAQADHLPQLT 
LPDSLTSAASPEDGLSAELLEAQAEEEPASTEGLD 
TGTEAEGLDSQAQISTTE*HPGL*QGP 


3814 


A


2 


884 


VFWQVRNAGSSPLSAACPLFRTPAPQPCGSWGR 
CCIPHASTGCRPMAERGELDLTGAKQNTGVWLV 

VSFTLNEDLANIHDIGGKPASVSAPREHPFVLQSV 

GGQTLTVFTESSSDKLSLEGIVVQRAECRPAASE 

NYMRLKRLQIEESSKPVRLSQQLDKWTTNYKP 

FSAFEKHQYYNLKDLVDITKQPWYLKEILKEIG 
VQNVKGIHKNTWELKPEYRHYQGEEKSD 


3815 


A 



17


411 


NIGDWEDIGKSPERIIQYYGPATWAQDGSRGYCT 
PIYMLNfflIRLQA\a.EIIMNERANALDLLAQQTTK 
MRNANYQNRLALDYLLAHEGGV+GKFSLINCC 
-KEIDDNGKADtelTARMRKLAHPVQTWER 


3816 


A


3


1172 


SHWQRRDRRCVRNMAERGRKRPCGPGEHGQRI 

EWRKWKQQKKEEKKKWKDLKLMKKLERQRAQ 

EEQAKRLEEEEAAAEKEDRGRPYTLSVALPGSIL 

DNAQSPELRTYLAGQIARACAIFCVDEIVVFDEE 

GQDAKTVEGEFTGVGKKGQACVQLARILQYLEC 

EESEFREGVVVDRPTRPGHGSFVNCGMKKEVKT 

DKNLEPGLRVTVRLNQQQHPDCKTYHGKWSS 

QDPRTKAGLYWGYTVRLASCLSAVFAEAPFQDG 

YDLTIGTSERGSDVASAQLPNFRHALWFGGLQG 

LEAGADADPNLEVAEPSVLFDLYVNTCPGQGSR 

TIRIIBEAILISLAALQPGLIQAGARHT 


3817 


A 


246 


1197 


FLSAGMSNFTHYAYLLMIESLMLGKVPPHVPSH

HFIFHDDGSARQKGESDYKVIIQQWFSKSGPWTT

SSNVTWGLLELQQSISESAVLTEPPGDSGAGSNLI

KTEQPGEPLEHVYVTDCHAVALESRHQKGELQC

LIKMCEPLSKPLQMFFSPPHWEAWLQRVQQLAK

NTRYFRQRLQEMGHIYGNENASWPLLLYMPG

KVAAFARHMLEKKIGWWGFPATPLAEARARF

CVSAAHTREMLDTVLEALDEMGDLLQLKYSRH

KKSARPELYDETSFELED


3818 


A 


215 


789 


NPQSSSSEGSSEIFQVNGHNRLLVQRSEVTQAPG

QYTVDVEGHGCTFIQATLKYNVLLPKKASGFSLS

LEIVKNYSSTAFDLTVTLKYTGIRNKSSMVVIDV

KMLSGFTPTMSSffiELENKGQVMKTEVKNDHVL 
FYUEhTVFGRADSFTFSVEQSNLVFNIQPAPGMVY 
DYYEKEEYALAFYHINSSSVSE

3819 


A 


1 


1483 


RIPDSnSRGVQGLPRDTASLbTlPSHSPKAQAlSK 

LSTASCPTPKVQSRCSSKENILRASHSAVDrrKVA 

RRHRMSPFPLTSMDKAFITVLEMTPVLGTOIINYR 

DGMGRVLAQDVYAKDNLPPFPASVKDGYAVRA 

ADGPGDRFnGESQAGEQPTQTVMPGQVMRVTT 

GAPIPCGADAWQVEDTELIRESDDGTEELEVRIL 

VQARPGQDERPIGHDIKRGECVLAKGTHMGPSEI 

GLLATVGVTEVEVNKFPWAVMSTGNELLNPED 

DLLPGKIRDSNRSTLLATIQEHGYPTINLGIVGDN 

PDDLLNALNEGISRADVnXSGGVSMGEKDYLKQ 

V1X>IDIJHAQIHFGRVFMKPGLPTTFATLDIDGVR 

KIIFALPGNPVSAVVTCNLFWPALRKMQGILDP 

RPTmCARLSCDVKLDPRPEYHRCILTWHHQEPLP 

WAQSTGNQMSSRLMSMRSANQLLMLPPKTEQY 

VELHKGEWDVMVIGRL 


3820 


A 


2216 


487 


PQEPALKSEFSQVASNTIPLPLPQPNTCKDNGPCK 

QVCSTVGGSAICSCFPGYAIMADGVSCEDQDECL 

MGAHDCSRRQFCVNTLGSFYCVNHTVLCADGYI 

LNAHRKCVDINECVTDLHTCSRGEHCVNTLGSF 

HCYKALTCEPGYALKDGECEDVDECAMGTHTC 

QPGFLCQNTKGSFYCQARQRCMDGFLQDPEGNC 

VDINECTSLSEPCRPGFSCINTVGSYTCQRNPLIC 

ARGYHASDDGTKCVDVNECETGVHRCGEGQVC 

HNLPGSYRCDCKAGFQRDAFGRGCIDVNECWAS 

PGRLCQHTCENTLGSYRCSCASGFLLAADGKRC 

EDVNECEAQRCSQECANIYGSYQCYCRQGYQLA 

EDGHTCTDIDECAQGAGILCTFRCLNVPGSYQCA 

CPEQGYTMTANGRSCKDVDECALGTHNCSEAET 

CHNIQGSFRCLRFECPPNYVQVSKTKCBRTTCHD 

FLECQNSPARITHYQLNFQTGLLVPAHIFRIGPAP 

AFTGDnALNIIKGNEEGYFGTRRLNAYTGVVYL 

QRAVLEPRDFALDVEMKLWRQGSVTTFLAKMHI 

FFTTFAL 


3821 


A 


2216 


487 


PQEPALKSEFSQVASNTIPLPLPQPNTCKDNUl^K 

QVCSTVGGSAICSCFPGYAIMADGVSCEDQDECL 

MGAHDCSRRQFCVNTLGSFYCVNHTVLCADGYI 

LNAHRKCVDINECVTDLHTCSRGEHCVNTLGSF 

HCYKALTCEPGYALKDGECEDVDECAMGTHTC 

QPGFLCQNTKGSFYCQARQRCMDGFLQDPEGNC 

VDINECTSLSEPCRPGFSCINTVGSYTCQRNPLIC 

ARGYHASDDGTKCVDVNECETGVHRCGEGQVC 

ffiJLPGSYRCDCKAGFQRDAFGRGCIDVNECWAS 

PGRLCQHTCENTLGSYRCSCASGFLLAADGKRC 

EDVNECEAQRCSQECANIYGSYQCYCRQGYQLA 

EDGHTCTDIDECAQGAGILCTFRCLNVPGSYQCA 

CPEQGYTMTANGRSCKDVDECALGTHNCSEAET 

CHNIQGSFRCLRFECPPNYVQVSKTKCERTTCHD 

FLECQNSPARITHYQLNFQTGLLVPAHIFRIGPAP 

AFTGDTIALNIIKGNEEGYFGTRRLNAYTGWYL 

QRAVLEPRDFALDVEMKI^WKv^UDV i iri-Arayuxi 

FFTTFAL 


3822 


A 


2502 


1540 


MAAATRGCRPWGSLLGLLGLVSAAAAAWDLAS
LRCTLGAFCECDFRPDLPGLECDLAQHLAGQHL 
AKALVVKALKAFVRDPAPrEOPLVLSLHGWTGTG 
KSYVSSLLAHYLFQGGLRSPRVHHFSPVLHFPHP

SHIERYKKDLKSWVQGNLTACGRSLFLFDEMDK 

MPPGLIV!E\a.RPFLGSSWVWGTNYRKAIFIFISN 

TGGEQINQVALEAWRSRRDREEILLQELEPVISR 

AVLDNPHHGFSNSGIMEERLLDAWPFLPLQRHH 

VRHCYIiOELAQLGLEPRDEWQAVLDSTTFFPE 

DEQLFSSNGCKTVASRIAFFL 



3823 



3174 



YGCEKTTEGRIPLKNIYRLFSADRKRVETALEAC 

SLPSSR^IDSIPQEDFTPEVYRVFLNNLCPRPEIDNI 

FSEFGAKSKPYLTVDQMMDFINLKQRDPRLNEIL 

YPPLKQEQVQVLIEKYEPNNSLARKGQISVDGFM 

RYLSGEENGWSPEKLDLNEDMSQPLSHYFINSS 

HNTYLTAGQLAGNSSVEMYRQVLLSGCRCVELD 

CWKGRTAEEEPVITHGFTMTTEISFKEVIEAIAEC 

AFKTSPFPILLSFE>fflrva)SPKQQAKMAEyCRLIFG 

DALLMEPLEKYPLESGWLPSPMDLMYKILVKN 

KKKSHKSSEGSGKKKLSEQASNTYSDSSSMFEPS 

SPGAGEADTESDDDDDDDDCKKSSMDEGTAGSE 

AMATEEMSNLVNYIQPVKFESFEISKKRNKSFEM 

SSFVETKGLEQLTKSPVEFVEYNKMQLSRIYPKG 

TRVDSSNYMPQLFWNAGCQMVALNFQTMDLA 

MQINMGMYEYNGKSGYRLKPEFMRRPDKHFDP 

FTEGIVDGIVANTLSVKirSGQFLSDKKVGTYVEV 

DMFGLPVDTRRKAFKTKTSQGNAVNPVWEEEPI 

VFKKWLPTLACLRIAVYEEGGKFIGHRILPVQAI 

RPGYHYICLRNERNQPLTLPAVFVYIEVKDYVPD 

TYADVIEALSOTIRYYNLMEQRAKQLAALTLEDE 

EE\aCKEApPGETl>SEAPSEARl^PAENGVNHTTT 

LWKPPSQALHSQPAPGSVKAPAKTEDLIQSVLTE 

VEAQTOELKQQKSFVKLQKKHYKEMKDLVKR 

HHKKTTDLIKEHTTKYNEIQNDYLRRRAALEKS 

AKKDSKXKSEPSSPDHGSSTIEQDLAALDAEMTQ 

KLIDLKDKQQQQLLNLRQEQYYSEKYQKREHIK 

LLIQKLTOVAEECQNNQLKKLKEICEKEKKELKK 

KMDKKRQEKITEAKSKDKSQMEEEKTEMIRSYI 

QEWQYIKRLEEAQSKRQEKLVEKHKEIRQQILD 

EKPKLQVELEQEYQDKFKJRLPLEILEFVQEAMKG 

KISEDSNHGSAPLSLSSDPGKVNHKIPSSEELGGD 

BPGKEFDTPL 



3824 



426 



ILHWFS^WSGRNNREKIGVHVGFEEILNMEPY 

CCREILKSLRPECFIYDLSAVVMHHGKGFGSGH 

YTAYCYNSEGGFWVHCNDSKLSMCTMDEVCKA 

QAYILFYTQRVTENGHSKLLPPELLLGSQHPNED 

ADTSSNEILS 



3825 



364 



GIRAKFPNKIPVVVERYPRETFLPPLDKTKFLVPQ 
ELTMTQFLSIIRSRMVLRATEAFYLLVNNKSLVS 
MSATMAEIYRDYKDEDGFVYMTYASQETFGCLE 
SAAPRDGSSLEDRPLHPL 



3826 



1237 



PEKKFERECREAEKAQQSYERLDNDTNATKADV 

EKAKQQLNLRTHMADENKNEYAAQLQNFNGEQ 

HKHFYVVIPQIYKQLQEMDERRTIKLSECYRGFA 

DSERKVIPIISKCLEGMILAAKSVDEIIRDSQMVV 

DSFKSGFEPPGDFPFEDYSQHIYRTISDGTISASKQ 

ESGKMDAKTTVGKAKGKLWLFGKKPKGPALED 

FSHLPPEQRRIOaQQRIDELNRELQKESDQKDAL 

NKMKDVYEKNPQMGDPGSLQPKIJ^TM>Q^ 




LRMEIHKNEAWLSEVEGKTGGRGDRRHSSDINH 
LVTQGRESPEGSYTDDANQEVRGPPQQHGHHNE 
FDDEFEDDDPLPAIGHCmYPFDGHNEGTLAMK 
EGEVLYIIEEDKGDGWTRARRQNGEEGYVPTSYI 
DVTLEKNSKGS 


3827 


A 


2 


1584 


INPVSSAVNGEAHSSHETRGQNSNALPSVLLELL 

SQSCLPAMSSYLRNDSVLDMARHVPLYRALLEL 

LRAIASCAAMVPLLLPLSTENGEEEEEQSECQTS 

VGTLLAKMKTCVDTYTNRLRSKJtENVKTGVKP 

DASDQEPEGLTLLVPDIQKTAEIVYAATTSLRQA 

NQEKKLGEYSKKAAMKPKPLSVLKSLEEKYVAV 

MKKLQFDTFEMVSEDEDGKLGFKVNYHYMSQV 

KNANDANSAARARRLAQEAVTLSTSLPLSSSSSV 

FVRCDEERLDIMKVLITGPADTPYANGCFEFDVY 

FPQDYPSSPPLVNLETTGGHSVRFNPNLYNDGKV 

CLSBLNTWHGRPEEKWNPQTSSFLQVLVSVQSLI 

LVAEPYFNEPGYERSRGTPSGTQSSREYDGNIRQ 

ATVKWAMLEQIROTSPCFKEVIHKHFYLKRVEIM 

AQCEEWIADIQQYSSDKRVGRTMSHHAAALKRH 

TAQLREELLKLPCPEGLDPDTDDAPEVCRATTGA 

EETLMHDQVKPSSSKELPSDFQL 


3828 


A 


1415 


845 


PRVPATLVSLDPWHCFPTAGRLAGSTWVPPACT 

LQLGPSSEHELDNHRAPLLSLPSQESLSFTPWYLV 

ACKPLFHIFCPLFACFMQEGKVQYLFLHLSHMRL 

LNYYFFPFLAPESLMQALEDLDYLAALDNDGNL 

SEFGHMSEFPLPPQLSKSILASCEFDCVDEVLTIA 

AMYTGILNDYSFSFFANLH 


3829 


A 


199 


683 


VDHTPVLSKPQCFSSVKWGATLSARSQKTSGIGR 
LMVHVIEATELKACKPNGKSNPYCEISMGSQSYT 
TRTIQDILNPKWNFNCQFFIKDLYQDVLCLTLFD 
RDQFSPDDFLGRTEIPVAKIRTEQESKGPMTRRLL 
LHEVPTGEVWVRFDLQLFEQKTLL 


3830 


A 


1747 


404 


RKiVlMEESGIETTPPGTPPPNPAGLAATAMSSTPV 

PLAATSSFSSPNVSSMESFPPLAYSTPQPPLPPVRP 

SAPLPFVPPPAVPSVPPLVTSMPPPVSPSTAAAFG 

NPPVSHFPPSTSAPNTLLPAPPSGPPISGFSVGSTY 

DITRGHAGRAPQTPLMPSFSAPSGTGLLPTPITQQ 

ASLTSLAQGTGTTSAITFPEEQEDPRITRGQDEAS 

AGGIWGFIKGVAGNPMVKSVLDKTKHSVESMIT 

TLDPGMAPYIKSGGELDIVVTSNKEVKVAAVRD 

AFQEVFGLAVWGEAGQSNIAPQPVGYAAGLKG 

AQERIDSLRRTGVIHEKQTAVSVENFIAELLPDK 

WFDIGCLWEDPVHGIHLETFTQATPVPLEFVQQ 

AQSLTPQDYNLRWSGLLVTVGEVLEKSLLNVSR 

TOWHMAFTGMSRRQMn^SAARAUGMYKQRLP 

PRTV 


3831 


A 


5 


674 


FWTRSAWHEGLQQMKANDPSLQE\^YNIKN1P 

IPTLREFAKALETNTHVKKFSLAATRSNDPVAIAF 

ADMLKVNTTLTSLNIESHFITGTGILALVEALKEN 

DTLTEIKIDNQRQQLGTA VliMDlAl^Mi- JiJtiw ojul. 

KFGYQFIXQGPRTRVAAAITKNNDLAWQKDTQ 

EQTSIWQWSQSIAGFNPQFEVQGQNARSWMEE 

LGKAFHQFVRRELKQIEGKLP 


3832 


A 


164 


782 


EPWVPMDVAESPERDPHSPEDEEQPQGLSDDDIL 
RDSGSDQDLDGAGVRASDLEDEESAARGPSQEE

EDNHSDEEDRASEPKSQDQDSEVNELSRGPTSSP 

CEEEGDEGEEDRTSDLRDEASSVTRELDEHELDY 

DEEVPEEPAPAVQEDEAEKAGAEDDEEKGEGTP 

REEGKAGVQSVGEKESLEAAKEKKKEDDDGEID 

DEEMY 


3833 


A 


122 


1676 


SQPPHFTQKMNENKDTDSKKSEEYEDDFEKDLE 

WLINENEKSDASnEMACEKEENINQDLKENETV 

MEHTKRHSDPDKSLQDEVSPRRNDHSVPGIQPLD 

PISDSDSENSFQESKLESQKDLEEEEDEEVRRYIM 

EKIVQANKLLQNQEPVNDKRERKLKFKDQLVDL 

EWPLEDTTTSKhTYFENERNMFGKLSQLCISN^^ 

GQEDVLLSLTNGSCEENKDRTILVERDGKFELLN 

LQDIASQGFLPPINNANSTENDPQQLLPRSSNSSV 

SGTKKEDSTAKIHAVTHSSTGEPLAYIAQPPLNR 

KTCPSSAVNSDRSKGNGKSNHRTOSAHISPVTST 

YCLSPRQKELQKQLEEfOREBCLKREEERRKIEEEK 

EKKRE>roiVFKAWLQKKREQVLEMRW 

EDMNSRQENRDPQQAFRLWLKKKHEEQMKERQ 

TEELRKQEECLFFLKGTEGRERAFKQWLRRKRM 

EKMAEQQAVRERTRQLRLEAKRSKQLQHHLYM 

SEAKPFRFTDHYN 


3834 


A 


575 


774 


RSRTEELSNSGDLKAMSKDLVTFGDVA VNFSQEE 
WEWLNPAQRNLYRKVMLENYRSLVSLGKDMSP 


3835 


A


2 


100 


ASDFYLRYYVGHKGKFGHEFLEFEFRPDGVYV 


3836 


A 


91 


749 


RPTPGHGDFWMOPLTKJDAGMSLSSVTLASALOV 
RGEALSEEEIWSLLFLAAEQLLEDLRNDSSDYVV
^i:PWSALL'SAAGSLSFQGRVSHIEAAPFKAPELLQ 
GQSEDEQPDASQMHVYSLGMTLYWSAGFHVPP 
HQPLQLCEPLHSILLTMCEDQPHRRCTLQSVLEA 
CRVHEKEVSVYPAPAGLHERRLVGLVLGnSEVS 
REPCFSSSSCWSCVAIKI 


3837 


A 


3 


1214 


SLGCTNSARGKGQDDEVRTLMANGAPFTTDWFS 

KLRVSCGYIGDNCKNGADVNAKDMLKMTALH 

WATERHHRDVVELLIKYGADVHAFSKFbKSAFD 

lALEKNNAEILVILQEAMQNQVNVNPERANPVTD 

PVSMAAPFIFTSGEVVNLASLISSTNTKTTSGDPH 

ASTVQFSNSTTSVLATLAALAEASVPLSNSHRAT 

ANTEEnEGNSVDSSIQQVMGSGGQRVinVTDGV 

PLGNIQTSIPTGGIGHPFIVTVQDGQQVLTVPAGK 

VAEETVIKEEEEEKLPLTKKPRIGEKTNSVEESKE 

GNERELLQQQLQEANRRAQEYRHQLLKKEQEAE 

QYRLKLEAIARQQPNGVDFTMVEEVAEVDAW 

VTEGELEERETKVTGSAGATGPPTRVSMATVSS 


3838 


A 


1 


1332 


MIEDNKENKDHSLERGRASLIFSLKNEVGGLIKA 

LKIFQEKHVNLLHIESRKSKRRNSEFEIFVDCDIN 

REQLNDIFHLLKSHTNVLSVNLPDNFTLKEDGME 

TVPWFPKKISDLDHCANRVLMYGSELDADHPGF 

KDNVYRKRRKYFADIAMNYKHGDPDPKVEFTEE 

EIKTWGTVFQELNKLYPTHACREYLKNLPLLSKY 

CGYREDNIPQLEDVSNFLKERtGFSIRPVAGYLSP 

RDFLSGLAFRVFHCTQYVRHSSDPFYTPEPDTCH 

ELLGHVPLLAEPSFAQFSQEIGLASLGASEEAVQ 

KLATCYFFTVEFGLCKQDGQLRVFGAGLLSSISE 

UCHALSGHAKVKPFDPKITCKQECLITITQDVYF 

VSESFEDAKEKMREFTKTIKRPFGVKYNPYTRSI

QILKDTKSITSAMNELQHDLDVVSDALAKVSRKF 
SI 



3839 



3093 



520 



MVNFTVDQIRAIMDKKANIRNMSVIAHVDHGKS 
TLTDSLVCKAGIIASARAGETRFTDTRKDEQERCI 
mSTAISLFYELSElSIDLNFIKQSKDGAGFLINLID 
SPGHVDFSSEVTAALRVTDGALVWDCVSGVCV 
QTETN^RQAIAERIKPVLMMNKMDRALL^^ 
PEELYQTFQRJVENVNVnSTYGEGESGPMGNIMI 
DPVLGTVGFGSGLHGWAFTLKQFAEMYVAKFA 
AKGEGQLGPAERAKKVEDMMKKLWGDRYFDP 
ANGKFSKSATSPEGKKLPRTFCQLILDPIFKVFDA 
IMNFKKEETAKLIEKLDIKLDSEDKDKEGKPLLK 
AVMRRWLPAGDALLQMITIHLPSPVTAQKYRCE 
LLYEGPPDDEAAMGIKSCDPKGPLMMYISKMVP 
TSDtCGRFYAFGRVFSGLVSTGLKVRIMGPNYTPG 
KKEDLYLKPIQRTILMMGRYVEPIEDVPCGNIVG 
LVGVDQFLVKTGTITTFEHAHNMRVMKFSVSPV 
VRVAVEAKNPADLPKLVEGLKRLAKSDPMVQCI 
IBESGEHIIAGAGELHLEICLKDLEEDHACIPIKKS 
DPVVSYREWSEESNVLCLSKSPNKHNRLYMKA 
RPFPDGLAEDIDKGEVSARQELKQRARYLAEKY 
EWDVAEARKIWCFGPDGTGPNILTDITKGVQYL 
NEIKDSWAGFQWATKEGALCEENMRGVRFDV 
HDVTLHADAIHRGGGQnPTARRCLYASVLTAQP 
RLMEPIYLVEIQCPEQWGGIYGVLNRKRGHVFE 
ESQVAGTPMFVVKAYLPVNESFGFTADLRSNTG 
GQAFPQCVFDHWJ3ILPGDPFDNSSRPSQVVAETR 
KRKGLKEGIPALDHFLDKL 



3840 



753 



SSTRSRDFCCSEAIQGSLTRRERRASGVRTRRSQU 
SSAMASKILLNVQEEVTCPICLELLTEPLSLDCGH 
SLCRACITVSNKEAVTSMGGKSSCPVCGISYSFE 
HLQANQfflJUWERLKEVKLSPDNGKJOODLCDH 
HGEKLLLFCKEDRKVICWLCERSQEHRGHHTVL 
TEEVFKECQEKLQAVLKRLKKEEEEAEKUBADIR 
EEKTSWKYQVQTERQRIQTEFDQLRSILl^NEEQR 
ELQRLEEEEKKT 



3841 



405 



GKAFSCFTYI^SQHRRTHMAEKPYECKTCKKAFS 
HFGNLKVHERIHTGEKPYECKECRKAFSWLTCL 
LRHERIHTGKKSYECQQCGKAFTRSRFLRGHEKT 
HTGEKMHECKECGKALSSLSSLHRHKRTHWRDT 



3842



311 



88 



3843 



1175 



AVLKNMAPMTALGLLDLHILNLELFLSAGEDFTS 
WSEIMMYILLWLTLWLLffiMIYCyR^ 
AAQENA 



APIRNSRIDDFVRRVESKAl^ARCGLWGSGPRRR 
PASGMFRGLSSWLGLQQPVAGGGQPNGDAPPEQ 
PSETVAESAEEELQQAGDQELLHQAKDFGNYLF 
OTASAATmiESVAETAQTIKKSVEEGKIDGIID 
KTnGDFQKEQKKFVEEQHTKKSEAAVPPWVDT 
NDEETIQQQILALSADBKNFLRDPPAGVQFNFDF 
DQMYPVALVMLQEDELLSKMRFALVPKLVKEE 
VFWRNYFYRVSLIKQSAQLTALAAQQQAAGKEE 
KSNGREQDLPLAEAVRPKTPPWIKSQLKTQEDE 
EEISTSPGVSEFVSDAFDACNLNQEDLRKEMEQL 
VLDKKQEETAVLEEDSADWEKELQQELQEYEV 










VTESEKRDENWDKEIEKMLQEEN 




A 






T PPADTPT7AWT T T AlsrVVVVT TT VPT KTlPT rnpf T T 

RCKLLPSALQKMALGMFFGFTSVIVAGVLEMER 

LHYIHHNETVSQQIGEVLYNAAPLSIWWQIPQYL 

LIGISEIFASIPGLEFAYSEAPRSMQGAIMGIFFCLS 

GVGSLLGSSLVALLSLPGGWLHCPKDFGNINNCR 

MDLYFFLLAGIQAVTALLFVWIAGRYERASQGP 

ASHSRFSRDRG 


3845 


A 


3 


1934 


PEDSAPQYSRLFPNASQHITPSYNYAPNPDKHWI 

MRYTGPMKPIHMEFTNMLQRKRLQTLMSVDDS 

METIYNMLVETGELDNTYIVYTADHGYHIGQFG

LVKGKSMPYEFDIRVPFYVRGPNVEAGCLNPHIV

LNIDLAPTILDIAGLDIPADMDGKSILKLLDTERP 

VNRFHLKKKMRVWRDSFLVERGKLLHKRDNDK

VDAQEENFLPKYQRVKDLCQRAEYQTACEQLG 

QKWQCVEDATGKLKLHKCKGPMRLGGSRALSN 

LVPKYYGQGSEACTCDSGDYKLSLAGRRKKLFK 

KKYKASYVRSRSIRSVAIEVDGRVYHVGLGDAA 

QPRNLTKRHWPGAPEDQDDKDGGDFSGTGGLP 

DYSAANPIKVTHRCYILENDTVQCDLDLYKSLQ

AWKDHKLHmHEffiTLQNKIKl^REVRGHLKkK 

JKJriiJlCUL»xll\io I rl 1 v^HJvUivLJU^KOool^rU'r KJvurJL 

QEKDKVWLLREQKRKKKLRKLLKRLQNNDTCS 

MPGLTCFTHDNQHWQTAPFWTLGPFCACTSAN 

NNTYWCMRTINETHNFLFCEFATGFLEYFDLNT 

DPYQLMNAVNTLDRDVLNQLHVQLMELRSCKG 

YKQCNPRTT0^I:GL#)GG 

PEMKRPSSKSLGQLWEGWEG 


3846 


A 


3 


1934 


PEDSAPQYSRLFPNASQHITPSYNYAPNPDKHWI 

MRYTGPMKPIHMEFTNMLQRKRLQTLMSVDDS

METIYNMLVETGELDNTYIVYTADHGYHIGQFG 

LVKGKSMPYEFDIRVPFYVRGPNVEAGCLNPHIV 

LNIDLAPTDLDIAGLDIPADMDGKSILKLLDTERP 

VNRFHLKKKMRVWRDSFLVERGKLLHKRDNDK 

VDAQEENFLPKYQRVKDLCQRAEYQTACEQLG 

QKWQCVEDATGKLKLHKCKGPMRLGGSRALSN 

LVPKYYGQGSEACTCDSGDYKLSLAGRRKKLFK 

KKYKASYVRSRSIRSVAIEVDGRVYHVGLGDAA 

QPRNLTKRHWPGAPEDQDDKDGGDFSGTGGLP 

DYSAANPIKVTHRCYILENDTVQCDLDLYKSLQ 

AWKDHKXHIDHEffiTLQNKJKNLREVRGHLKK^ 

QEKDKVWLLREQKRKKKLRKIXKRLQNNDTCS 

MPGLTCFTHDNQHWQTAPFWTLGPFCACTSAN 

NNTYWCMRTINETHNFLFCEFATGFLEYFDLNT 

DPYQLMNAVNTLDRDVLNQLHVQLMELRSCKG

YKQCNPRTRNMDLGLKDGGSYEQYRQFQRRKW 

PEMKRPSSKSLGQLWEGWEG 


3847 


A 


1 


1257 


NIWSAVLTAFHTGTSNTTFVVYENTYMNTTLPPP 
FQHPDLSPLLRYSFETMAPTGLSSLTVNSTAVPTT 
PAAFKSLNLPLQITLSAIMIFILFVSFLGNLWCLM 
VYQKAAMRSAINILLASLAFADMLLAVLNMPFA 
LVTELTTRWIFGKFFCRVSAMFFWLFVIEGVAILL 
nSIDRFLIIVQRQDKLNPYRAKVLIAVSWATSFCV 
AFPLAVGNPDLQJPSRAPQCVFGYTTNPGYQAYV 










ELISLISFFBPFLVILYSFMGILNTLRHNALRIHSYPE 

GICLSQASKLGLMGLQRPFQMSIDMGFKTRAFTT 

ILILFAWWCWAPFTTYSLVATFSKHFYYQHNFF 

EISTWLLWLCYLKSALNPLIYYWRIKKFHDACLD 

MMPKSFKFLPQIJ>GHTKRRIRPSAVYVCGEHRT 

W 


3848 


A 


3 


2827 


SSAVAAIOOaiSWASLVLAFLGVCLGITLAVDRS 

NFKTCEESSFCKRQRSIRPGLSPYRALLDSLQLGP 

DSLTVHLIHEVTKVLLVLELQGLQKNMTRFRIDE 

LEPRRPRYRVPDVLVADPPIARLSVSGRDENSVE 

LTMAEGPYKIILTARPFRLDLLEDRSLLLSVNARG 

LLEFEHQRAPRVSQGSKDPAEGDGAQPEETPRD 

GDKPEETQGKAEKDEPGAWEETFKTHSDSKPYG 

PMSVGLDFSLPGMEHVYGBPEHADNLRLKVTEG 

GEPYRLYNLDVFQYELYNPMALYGSVPVLLAHN 

PHRDLGIFWLNAAETWVDISSNTAGKTLFGKMM 

DYLQGSGETPQTDVRWMSETGIIDVFLLLGPSISD 

VFRQYASLTGTQALPPLFSLGYHQSRWNYRDEA 

DVLEVDQGFDDHNLPCDVIWLDIEHADGKRYFT 

WDPSRFPQPRTMLERLASKRRKLVAIVDPHIKVD 

SGYRVHEELRlSnLGLYVKTRDGSDYEGWCWPGS 

AGYPDFTNPTMRAWWANMFSYDNYEGSAPNLF 

VWNDMNEPSVFNGPEVTMLKDAQHYGGWEHR 

DVHNIYGLYVHMATADGLRQRSGGMERPFVLA 

RAFFAGSQRFGAVWTGDNTAEWDHLKISIPMCL 

SLGLVGLSFCGADVGGFFKNPEPELLVRWYQMG 

AYQPFFRAHAHLDTGRREPWLLPSQHNDIIRDAL 

GQRYSLLPF\VYTLLYQAHREGIPVMRPLWVQYP 

QDVTTFNIDDQYLLGDALLVHPVSDSGAHGVQV 

YLPGQGEVWYDIQSYQKHHGPQTLYLPVTLSSIP 

VFQRGGTIVPRWMRVRRSSECMKDDPITLFVALS 

PQGTAQGELFLDDGHTFNYQTRQEFLLRRFSFSG 

NTLVSSSADPEGHFETPIWffiRVVnGAGKPAAW 

LQTKGSPESRLSFQHDPETSVLVLJIKPGINVASD 

WSIHLR 


3849 


A 


1 


1717 


RARNARGCWGVCRSGFSSAVCGAARMEQVAEG 

ARVTAVPVSAADSTEELAEYEEGVGVVGEDNDA 

AARGAEAFGDSEEDGEDVFEVEKILDMKTEGGK 

VLYKVRWKGYTSDDDTWEPEIHLEDCKEVLLEF 

RKIOAENBCAKAVRKDIQRLSLNNDIFEANSDSDQ 

QSETKEDTSPKKKKJOXRQREEKSPDDLKKKKA 

KAGKLKDKSKPDLESSLESLVFDLRTKKRISEAK 

EELKESKKPKKDEVKETKELKKVKKGEIRDLKT 

KTREDPKEl^TKKEKFVESQVESESSVLNDSPF 

PEDDSEGLHSDSREEKQNTKSARERAGQDMGLE 

HGFEKPLDSAMSAEEDTDVRGRRKKKTPRKAED 

TRENRKLENKNAFLEKKTVPKKQRNQDRSKSAA 

ELEKLMPVSAQTPKGRRLSGEERGLWSTDSAEE 

DKETKRlSnBSKKPKKDEVKETKELKKVKKGEira 

LKTKTREDPKJiN Klv 1 JvJsJcilvr V Hov^ v ivocoo v JjIN u 

SPFPEDDSEGLHSDSREEKQNTKSARERAGQDM 

GLEHGFEKPLDSAMSAEEDTDVRGRRKKKTPRK 

AEDTRENRKLENKNAFLEKKTVPKKQRNQDRSK 

SAAELEKLMPVSAQTPKGRRLSGEERGLWSTDS 

AEEDiOETKRlSrESKKPKKDEVKETKELKKV^ 










IRDLKTKTREDPKENKKTKKEKFVESQVESESSV 
LNDSPFPED/RQ*RATFRQQREEKSPDDLKKKKA 
KAGKLKDKSKPDLESSLESLVFDLRTBCKRISEAK 
EELKESKKPK 


3850 


A 


1113 


3975 


PAAAAAAAAAAAAAAGRGPSFTPCFSPSLAVEPS 

RRTRLGSDPAQAMAGNVKKSSGAGGGSGSGGS 

GSGGLIGLMKDAFQPHHHHHHHLSPHPPGTVDK 

KMVEKCWKLMDKVVRLCQNPKLALKNSPPYIL 

DLLPDTYQHLRmSRYEGKMETLGENEYFRVF 

MENLMKKTKQTISLFKEGKERMYEENSQPRRNL 

TKLSLIFSHMLAELKGIFPSGLFQGDTFRITKADA 

AEFWRKAFGEKTIVPWKSFRQALHEVHPISSGLE 

AMALKSTIDLTCNDYISVFEFDIFTRLFQPWSSLL 

RlNTVWSLAVTHPGYMAFLTYDEVKARLQKFIHKP 

GSYIFRLSCTRLGQWAIGYVTADGNILQTIPHNKP 

LFQALIDGFREGFYLFPDGRNQNPDLTGLCEPTP 

QDHIKVTQEQYELYCEMGSTFQLCKICAENDKD 

VKIEPCGHLMCTSCLTSWQESEGQGCPFCRCEIK 

GTEPIWDPFDPRGSGSLLRQGAEGAPSPNYDDD 

DDERADDTLFMMKELAGAKVERPPSPFSMAPQA 

SLPPVPPRLDLLPQRVCVPSSASALGTASKAASGS 

LHKDKPLPVPPTLRDLPPPPPPDRPYSVGAESRPQ 

RRPLPCTPGDCPSRDKLPPVPSSRLGDSWLPRPIP 

KVPVSAPSSSDPWTGRELTNRHSLPFSLPSQMEP 

RPDVPRLGSTFSLDTSMSMNSSPLVGPECDHPKI 

KPSSSANAIYSLAARPLPVPKLPPGEQCEGEEDTE 

YMTPSSRPLRPLDTSOSSRACDCDOOIDSCTYEA 

MYMQSQAPSITESSTFGEGNLAAAHAOTGPE^^ 

ENEDDGYDVPKPPVPAVLARRTLSDISNASSS/FG 

LFVLERDP'PQNVTEGSQVPERPPKPFPRRINSER 

KAGSCQQGSGPAASAATA\SPQLSSEIENLMSQG 

YSYQDIQKALVIAQNNIEMAKNELREFVSISSPAH 

VAT 


3851 


A 


2 


2781 


GRVGSMDGAMGPRGLLLCMYLVSLLILQAMPA 

LGSATGRSKSSEKRQAVDTAVDGVFIRSLKVNC 

KVTSRFAHYWTSQVVNTANEAREVAFDLEIPK 

TAnSDFAVTADGNAFIGDIKDKVTAWKQYRKA 

AISGENAGLVRASGRTMEQFTIHLTVNPQSKVTF 

QLTYEEVLKENEJMQYEIVKVKPKQLVHHFEIDV 

DIFEPQGISKLDAQASFLPKELAAQTIKKSFSGKK 

GHVLFRPTVSQQQSCPTCSTSLLNGHFKVTYDVS 

RDKICDLLVANNHFAHFFAPQNLTNMNKNVVFV 

IDISGSMRGQKVKQTKEALLKILGDMQPGDYFD 

LVLFGTRVQSWKGSLVQASEANLQAAQDFVRGF 

SLDEATNLNGGLLRGIEILNQVQESLPELSNHASI 

LIMLTDGDPTEGVTDRSQILKNVRNAIRGRFPLY 

NLGFGHNVDFNFLEVMSMENNGRAQRIYEDHD 

ATQQLQGFYSQVAKPLLVDVDLQYPQDAVLALT 

QNHHKQYYEGSEIWAGRIADNKQSSFKADVQA 

HGEGQEFSITCLVDEEEMKKLLRERGHMLENHV 

ERLWAYLTIQELIj^^KRMKVDREVRAl^SSQAL 

MSLDYGFVTPLTSMSIRGMADQDGLKPTIDKPSE 

DSPPLEMLGPRRTFVLSALQPSPTHSSSNTQRLPD 

RVTGVDTDPHFIIHVPQKEDTLCF^fINEEPGVILS 

LVQDPNTGFSVNGQLIGNKARSPGQHDGTYFGR










LGIANPATDFQLEVTPQWTLNPGFGGPVFSWKD 

QAVLRQDGVWTINKKRNLVVSVDDGGTF\EVV\ 

LHRVWXKGSSWHQDFLGLLMCWDKSIGMSSPGR 

KGCWGQ\FFHPIRFLKVS*HPPPGSDPQKAQMPT 

MVVKf^PGLTVmGLQKX)YSKI)PWHGAEVSC 

WFI\HNNGA*I\TDCAYTDYI\VPDIF 


3852 


A 


39 


1735 


TQVAEAGRGEGWAGAETGRPQSAGMNLELLES 

FGQNYPEEADGTLDCISMALTCTFNRWGTLLAV 

GCNDGRIVIWJ)F\LTRGIA*NKFSAHIHPVCSLC 

WSRDGHKLVSASTDNIVSQWDVLSGDCDQRFRF 

PSPILKVQYHPRDQ>nECVLVCPMKSAPVMLTLSD 

SKHWLPVDDDSDLNWASFDRRGEYIYTGNAK 

GKILVLKTDSQDLVASFRVTTGTSNTTAIKSIEFA 

RKGSCFLINTADRIIRVYDGREILTCGRDGEPEPM 

QKLQDLVNRTPWKKCCFSGDGEYIVAGSARQH 

ALYIWEKSIGNLVKILHGTRGELLLDVAWHPVRP 

IIASISSGWSIWAQNQVENWSAFAPDFKELDEN 

VEYEERESEFDIEDEDKSEPEQTGADAAEDEEVD 

VTSVDPIAAFCSSDEELEDSKALLYLPIAPEVEDP 

EENPYGPPPDAVQTSLMDEGASSEKKRQSSADG 

SQPPKKKPKTTNIELQGVPNDEVHPLLGVKGDG 

KSKKKQAGRPKGSKGKEKDSPFKPKLYKGDRGL 

PLEGSAKGKVQAELSQPLTAGGAISELL 


3853 


A 


45 


2603 


PLLFTCGREVRARDPEKEGTIVVAGLKVQVQPRF 

LWILCFSMEETQGELTSSCGSKTMANVSLAFRDV 

SIDLSQEEWECLDAVQRDLYKDVMLENYSNLVS 

LDLEYKYITKNLLSEK^^CKIYLSQLQTGEKSKN 

TIHEDTIFRNGLQCKHEFERQERHQMGCVSQMLI 

QKQISHPLHPKIHAREKSYECKECRKAFRQQSYLI 

QHLRIHTGERPYKCMECGKAFCRVGDLRVHHTI 

HAGERPYECKECGBCAFRLHYHLTEHQRIHSGVK 

PYECKECGKAFSRVRDLRVHQTIHAGERPYECK 

ECGKAFRLHYQLTEHQRIHTGERPYECKVCGKT 

FRVQRHISQHQKIHTGVKPYKCNECGKAFSHGS 

YLVQHQKIHTGEKPYECKECGKSFSFHAELARH 

RRJHTGEKPYECRECGKAFRLQTELTRHHRTHTG 

EKPYECKECGKAFICGYQLTLHLRTHTGEIPYEC 

KECGKTFSSRYHLTQHYRIHTGEKPYICNECGKA 

FRLQGELTRHHRIHTCEKPYECKECGKAFIHSNQ 

FISHQRIHTSESTYICKECGKIFSRRYNLTQHFKIH 

TGEKPYICNECGKAFRFQTELTQHHRIHTGEKPY 

KCTECGKAFIRSTHLTQHHRIHTGEKPYECTECG 

KTFSRHYHLTQHHRGHTGEKPYICNECGNAFICS 

YRLTLHQRIHTGELPYECKECGKTFSRRYHLTQH 

FRLHTGEKPYSCKECGNAFRLQAELTRHinVHTG 

EKPYKCKECGKAFSVNSELTRHHRIHTGEKPYQC 

KECGKAFIRSDQLTLHQ\KIILVR\NPMHNVKRIR 

WPLENAL*QRICmRNFLFVTEHVGIPFTSCSQFI 

RNYFVC 


3854 


A 


108 


894 


LQSCWVPGIPWPSVGWLSWLKiJLl'bdimsASi^a 
AVLQGPQCSEMLWPKNLTSWDDSSSVSSGISDTI 
DNLSTDDINTSSSISSYANTPASSRKNLDVQTDAE 
KHSQVERNSLWSGDDVKKSDGGSDSGIKMEPGS 
KWRRNPSDVSDESDKSTSGKKNPVISQTGSWRR 
GMTAQVGITMPRTKASAPAGALKTPGTGKRPGL 










S\GPGAPTPAAPPQLARMAWAFSLSAASTPAVSP 
STSPSAVEGSPATILPLASSPPPRTTP*LPLSELTV* 
RPQELVRGRGCLGPGAPTPAAPPQLARMAWAFS 
LSAASTPAVSPSTSPSAVEGSPATELPLASSPPPRT 
TP 


3855 


A 


1 


772 


FRGGDGAPGVLKPGNPLPFPLPPLQYPPPSTLSHS 

VGKSSLVLRFVKGQFHEYQESTIGAAFLTQSVCL 

DDTTVKFEIWDTAGQERYHSLAPMYYRGAQAAI 

WYDITNQETFARAKTWVKELQRQASP\SIWGL 

AGNKADLANKIIMVEYEEAQAYADDNSLLFMET 

SAKTAMhrVKDLFL\AIA»EVAKRVNPQNLG\G\A 

AGRSRGVDLHEQSVQQNKSQCCSN 


3856 



A 


2815 


352 


LGLEAAARPRPGGPAAMQDGNFLLSALQPEAGV 

CSLALPSDLQLDRRGAEGPEAERLRAARVQEQV 

RARLLQLGQQPRHNGAAEPEPEAETARGTSRGQ 

YHTLQAGFSSRSQGLSGDKTSGFRPIAKPAYSPA 

SWSSRSAVDLSCSRRLSSAHNGGSAFGAAGYGG 

AQPTPPMPTRPVSFJIERGGVGSRADYDTLSLRSL 

RLGPGGLDDRYSLVSEQLEPAATSTYRAFAYER 

QASSSSSRAGGLDWPEATEVSPSRTIRAPAVRTL 

QRFQSSHRSRGVGGAVPGAVLEPVARAPSVRSLS 

LSLADSGHLPDVHGFNSYGSHRTLQRLSSGFDDI 

DLPSAVKYLMASDPNLQVLGAAYIQHKCYSDAA 

AKKQARSLQAVPRLVKLFNHANQEVQRHATGA 

MRNLIYDNADNKLALYEENGIFELLRTLREQDDE

' LRKJm^Gtt^Vl^LSSSbB^ 
LGV*APLSGAGGPP\LIQQNASEAEIFYNATGFPR 
NLSSASQATRQKMRECHGLVDALVTSINHALDA 
GKCEDKSVENAVCVLRNLSYRLYDEMPPSALQR 
LEGRGRRDLAGAPPGEWGCFTPQSRRLRELPLA 

ATIAT TPAFV^VriPlf riT "PWT WWHIVnT VMPT T O 
J\LJJ\Lt X rj\jjt V OAX/Jr J^VJljiD W W OMTS^l V VJJ^ X rN JVLfJL/V^ 

RCELNRHTTEAAAGALQNITGGVDPRGPGGLSRL 
ALEQERILNPLLDRVRTADHHQLRSLTGLIRNLS 

RNARNKDEMSTKVV\SHLI\EKLPGSVGEKSPPAE 
\a.VmiAVFNNLGWLASPI/ALARDLLYFDGLRK 
LIFIKKKRDSPDSEKSSRAASSLLANLWQYhJKLH 
RDFRAKGYRKEDFLGP 


3857 


A 


1034 


204 


VAVTLLSQLPSAIQRTAAWEMRAPLTFRVPLALD 
LIKPEHCTVNVDNSLSIPVIAAELVVRKPSEKGM 

MFALPLK*PVTAAFHDSSMPSSLLQIEMEQLFLE 

ARLQ/PDSBCSEARRNQCDSMLLRNQQLCSTCQE 

MKMVQPRTMKIPDDPKASFENCMSYRMSLHQP 

KFQTTPEPFHDDIPTENIHLQNL/PILGPRTAVFHG 

LLTEAYKTLKERQRSSLPRKEPIGKTTEAVSGRSS 

SPPRLPERK 


3858 


A 


203 


3469 


SHQEBEQNSAMAPRKRGGRGISFIFCCFRNNDHPE 

ITYRLRITOSNFALQTMEPALPMPPVEELDVMFSE 

LVDELDLTDKHREAMFALPAEKXWQIYCSKKK 

DQEENKGATSWPEFYIDQLNSMAARKSLLALEK 

EEEEERSKTIESLKTALRTKPMRFVTRFIDLDGLS 

CILNFLKTMDYETSESRfflTSLIGCIKALMNNSQG 

RAHVLAHSESINVIAQSLSTENIKTKVAVLEILGA 

VCLWGGHKKVLQAMLHYQKYASERTRFQTLIN 










DLDKSTGRYRDEVSLKTAIMSFINAVLSQGAGVE 

SLDFRLHLRYE\FLMLGIHPVMDKLRKHENSTLD 

RmDFFEMLRNEDELEFAKRFELVHIDTKSATQM 

FELTRKRLTHSEAYPHFMSILHHCLQMPYKRSGN 

TVQYWLLLDRHQQIVIQNDKGQDPDSTPLENFNI 

KhTVVRMLVNENEVKQWKEQAEmRKEHNELQ 

QKLEKKERECDAKTQEKEEMMQTLNKMKEKLB 

KETTEHKQVKQQVADLTAQLHELSRRAVCASIP 

GGPSPGAPGGPFPSSVPGSLLPPPPPPPLPGGMLPP 

PPPPLPPGGPPPPPGPPPLGAIMPPPGAPMGLALK 

KKSIPQPTNALKSFNWSKLPENKLEGTVWTEIDD 

TKVFOLDLEDLERTFSAYQRQQDFFVNSNSKQK 

EADAIDDTLSSKLKVKELSVIDGRRAQNCNILLS 

RLKLS>nDEIKRAILTMDEQEDLPKDMLEQLlJKW 

PEKSDmLLEEHKHELDRMAKADRFLFEMSRINH 

YQQRLQSLYFKKKFAERVAEVKPKVEAIRSGSEE 

VFRSGALKQLLEWLAFGNYMNKGQRGNAYGF 

KISSL^^KIADTKSSIDK^^TLLHYLITIVENKYPSV 

LNLNEHLRDIPQAAKVNMTELDKEISTLRSGLKA 

VETELEYQKSQPPQPGDKFVSVVSQFITVASFSFS 

DVEDLLAEAKDLFTKAVKHFGEEAGKIQPDEFF 

GIFDQFLQAVSEAKQENENMRKKKEEEERRARM 

EAQLKEQRERERKMRKAKENSEESGEFDDLVSA 

LRSGEVFDKDLSKLKRNRKRITNQMTDSSRERPI 

TKLNF 


3859 


A 


1279 


141 



RVEHLSEFLVDIKPSLTFDVIPLLDPYGPAGSDPS 

LEFLWSEETYRGGMAINRFRLENDLEELALYQI 

QLLKDLRHTENEEDKVSSSSFRQRMLGNLLRP^ 

ERPELPTCLYVIGLTGISGSGKSSIAQRLKGLGAF 

VIDSDHLGHRAYAPGGPAYQPVVEAFGTDILHK 

DGIIMIKVLGSRWGI^QLKILTDIMWPIIAKLA 

REEMDRAVAEGKRVCVIDAAVLLEAGWQNLVH 

EVWTAVIPETEAVRRIVERDGLSEAAAQSRLQSQ 

MSGQQLVEQSHWLS'nCGSRISPNARWRKPGPS 

CRSAFPRLIRPSTEKFSVGPDWLLELTSDPWRRN 

GGLDAHPGSGPEVQAILCRTWPGLVDTGSLPNTL 

VFGQH 


3860 


A 


1 


3881 


MGQKSVGASYVQIPLVPPLSRHPKGLGHEDRWS 

SYCLSSLAAQNICTSKLHCPAAPEHTDPSEPRGSV 

SCCSLLRGLSSGWSSPLLPAPVCNPNKAIFTVDA 

KTTEELVANDKACGLLGYSSQDLIGQKLTQFFLR 

SDSDVVEALSEEHMEADGHAAWFGTVVDIISRS

GEKIPVSVWMKRMRQERRLCCVVVLEPVERVST 

WVAFQSDGTVTSCDSLFAHLHGYVSGEDVAGQ 

HITDLIPSVQLPPSGQHIPKNLKIQRSVGRARDGT 

TFPLSLKLKSQPSSEEATTGEAAPVSGYRASVWV 

FCTISGLITLLPDGTIHGINHSFALTLFGYGKTELL 

GKNITFLIPGFYSYMDLAYNSSLQLPDLASCLDV 

GNESGCGERTLDPWQGQDPAEGGQDPRINWLA 

GGHNTSnPRDEIRKLMESQDIFrGTQTELlAGGQLL 

SCLSPQPAPGVDNVPEGSLPVHGEQALPKDQQIT 

ALGREEPVAIESPGQDLLGESRSEPVDVKPFASCE 

DSEAPVPAEDGGSDAGMCGLCQKAQLERMGVS 

GPSGSDLWAGAAVAKPQAKGQLAGGSLLMHCP 

CYGSEWGLWWSQDLAPSPSGMAGLSFGTPTLD 










EPWLGVENDREELQTCLIKEQLSQLSLAGALDVP 

HAELVPTECQAVTAPVSSCDLGGRDLCGGCTGS 

SSACYALATDLPGGLEAVEAQEVDVNSFSWNLK 

ELFFSDQTDQTSSNCSCATSELRETPSSLAVGSDP 

DVGSLQEQGSCVLDDRELLLLTGTCVDLGQGRR 

FRESCVGHDPTEPLEVCLVSSEHYAASDRESPGH 

VPSTLDAGPEDTCPSAEEPRLNVQVTSTPVIVMR 

GAAGLQREIQEGAYSGSCYHRDGLRLSIQFEVRR 

VELQGPTPLFCCWLVKDLLHSQRDSAARTRLFL 

ASLPGSTHSTAAELTGPSLVEVLRARPWFEEPPK 

AVELEGLAACEGEYSQKYSTMSPLGSGAFGFVW 

TA VDKEKNKEVV\^IKKEKVLEDC WIEDPKLG 

KVTLEIAILSRVEHANIIKVLDIFENQGFFQLVME 

KHGSGLDLFAFIDRHPRLDEPLASYIFRQVRAG\Q

SRLVSAVGYLRLKDIIHRDIKDENIVIAEDFTIKLI

DFGSAAYLERGKLFYTFCGTEEYCAPEVLMGNPY 

RGPELEMWSLGVTLYTLVFEENPFCELEETVEAA 

IHPPYLVSKELMSLVSGLLQPVPERRTTLEKLVT 

DPWVTQPVNLADYTWEEVFRVNKPESGVLSAAS 

LEMGNRSLSDVAQAQELCGGPVPGEAPNGQGCL 

HPGDPRTLTS 


3861 


A 


1 


3881 


MGQKSVGASYVQIPLVPPLSRHPKGLGHEDRWS 

SYCLSSLAAQNICTSKLHCPAAPEHTDPSEPRGSV 

SCCSLLRGLSSGWSSPLLPAPVCNPNKAIFTVDA 

KTTEILVANDKACGLLGYSSQDLIGQKLTQFFLR 

SDSDVVEALSEEHMEADGHAAVVFGTVVDIISRS 

GEKIPVSVWMKRMRQERRLCCVVVLEPVERVST 

WVAFQSDGTVTSCDSLFAHLHGYVSGEDVAGQ

HITDLIPSVQLPPSGQHIPKNLKIQRSVGRARDGT 

TFPLSLKLKSQPSSEEATTGEAAPVSGYRASVWV 

FCTISGLITLLPDGTIHGINHSFALTLFGYGKTELL 

GKNITFLIPGFYSYMDLAYNSSLQLPDLASCLDV 

GNESGCGERTLDPWQGQDPAEGGQDPRINWLA 

GGHWPRDEIRKLMESQDIFTGTQTELIAGGQLL 

SCLSPQPAPGVDNVPEGSLPVHGEQALPKDQQIT 

ALGREEPVAIESPGQDLLGESRSEPVDVKPFASCE 

DSEAPVPAEDGGSDAGMCGLCQKAQLERMGVS 

GPSGSDLWAGAAVAKPQAKGQLAGGSLLMHCP 

CYGSEWGLWWRSQDLAPSPSGMAGLSFGTPTLD 

EPWLGVENDREELQTCLIKEQLSQLSLAGALDVP 

HAELVPTECQAVTAPVSSCDLGGRDLCGGCTGS 

SSACYALATDLPGGLEAVEAQEVDVNSFSWNLK 

ELFFSDQTDQTSSNCSCATSELRETPSSLAVGSDP 

DVGSLQEQGSCVLDDRELLLLTGTCVDLGQGRR 

FRESCVGHDPTEPLEVCLVSSEHYAASDRESPGH 

VPSTLDAGPEDTCPSAEEPRLNVQVTSTPVIVMR 

GAAGLQREIQEGAYSGSCYHRDGLRLSIQFEVRR 

VELQGPTPLFCCWLVKDLLHSQRDSAARTRLFL 

ASLPGSTHSTAAELTGPSLVEVLRARPWFEEPPK 

AVELEGLAACEGEYSQKYSTMSPLGSGAFGFVW 

TA\n5KEKNKEVWKFIKKEKVLEDCWIEDPKLG 

KVTLEIAILSRVEHANIIKVLDIFENQGFFQLVME 

KHGSGLDLFAFIDRHPRLDEPLASYIFRQVRAG\Q 

SRLVSAVGYLRLKDIIHRDIKDENIVIAEDFTIKLI

DFGSAAYLERGKLFYTFCGTIEYCAPEVLMGNPY 






WO 01/57190

PCT/US01/04098













RGPELEMWSLGVILYTLVFEENPFCELEETVEAA 

IHPPYLVSKELMSLVSGIXQPWERRTTLEKLVT 

DPWVTQPVNLADYTWEEVFRVNKPESGVLSAAS 

LEMGNRSLSDVAQAQELCGGPVPGEAPNGQGCL 

HPGDPRLLTS 


3862 


A 


399 


2069 


TMDRSKRNSIAGFPPRVEVRLEEFEGGGGGEGNV 

SQVGRVWPSSYRALISAFFRLTRLDDFTCEKIGSG

FFSEVFKVRHRASGQVMALKMNTLSSNRANML 

KEVQLMNRLSHPNILRYINSGNLEQLLDSNLHLP 

WTVRVKLAYDIAVGLSYLHFKGIFHRDLTSKNC 

LIKRDENGYSAWADFGLAEKIPDVSMGSEKLA 

WGSPFWMAPEVLRDEPYNEKADVFSYGIILCEII 

ARIQADPDYLPRTENFGLDYDAFQHMVGDCPPD 

FLQLTFNCCNMDPKLRPSFVEIGKTLEEILSRLQE

EEQERDRKLQPTARGLLEKAPGVKRLSSLDDKIP 

HKSPCPRRTIWLSRSQSDIFSRKPPRTVSVLDPYY 

RPRDGAARTPKVNPFSARQDLMGGKIKFFDLPSK 

SVISLVFDLDAPGPGTMPLADWQEPLAPPIRRWR 

SLPGSPEFLHQEACPFVGREESLSDGPPPRLSSLK 

YRVKEIPPFRASALPAAQAHEAMDCSILQEENGF 

GSRPQGTSPCPAGASEEMEVEERPAGSTPATFSTS 

GIGLQTQGKQDG 


3863 


A 


399 


2069 


TMDRSKRNSIAGFPPRVEVRLEEFEGGGGGEGNV 

SQVGRVWPSSYRALISAFFRLTRLDDFTCEKIGSG 

FFSEVFKVRHRASGQVMALKMNTLSSNRANML 

KEVQLMNRLSHPNILRYINSGNLEQLLDSNLHLP 

WTVRVKLAYDIAVGLSYLHFKGIFHRDLTSKNC 

LIKRDENGYSAWADFGLAEKIPDVSMGSEKLA 

VVGSPFWMAPEVLRDEPYNEKADVFSYGIILCEII 

ARIQADPDYLPRTENFGLDYDAFQHMVGDCPPD 

FLQLTFNCCNMDPKLRPSFVEIGKTLEEILSRLQE 

EEQERDRKLQPTARGLLEKAPGVKRLSSLDDKIP 

HKSPCPRRTIWLSRSQSDIFSRKPPRTVSVLDPYY 

RPRDGAARTPKVNPFSARQDLMGGKIKFFDLPSK 

SVISLVFDLDAPGPGTMPLADWQEPLAPPIRRWR 

SLPGSPEFLHQEACPFVGREESLSDGPPPRLSSLK 

YRVKEIPPFRASALPAAQAHEAMDCSILQEENGF 

GSRPQGTSPCPAGASEEMEVEERPAGSTPATFSTS 

GIGLQTQGKQDG 


3864 


A 


3 


911 


SWNMDSDSCAAAFHPEEYSPSCKRRRTVEDFNK 

FCTFVLAYAGYIPYPKEELPLRSSPSPANSTAGTI 

DSDGWDAGFSDIASSVPLPVSDRCFSHLQPTLLQ 

RAKPSim.LDRKKTDKLKKKKKRKRRDSDAPGK 

EGYRGGLLKLEAADPYVETPTSPTLQDIPQAPSD 

PCSGWDSDTPSSGSCATVSPDQVKEIKTEGKRTI 

VR/QEAQLMARNDGNFSSLLESIFPS\DDDSWDLV 

TCFCMKPFAGRPMIECNECHTWIHLSCAKIRKSN 

VPEVFVCQKCRDSKFDIRRSNRSRTGSRKLFLD 


3865


A 


3 


3573 


QERLRSRSRPDRAAREAGSARGRQPKRTERVEQ 

FLTIARRRGRRSMPVSLEDSGEPTSCPATDAETAS 

EGSVESASETRSGPQSASTAVKERPASSEKVKGG 

DDHDDTSDSDSDGLTLBCELQNRLRRKREQEPTE 

RPLKGIQSRLRKKRREEGPAETVGSEASDTVEGV 

LPSKQEPENDQGWSQAGKDDRESKLEGKAAQD 

IKDEEPGDLGRPKPECEGYDPNALYCICRQPHNN 










RFMICCDRCEEWFHGDCVGISEARGRLLERNGE 

DYICPNCHLQVQDETHSETADQQEAKWRPGDA 

DGTDCTSIGTIEQKSSEDQGIKGRIEKAANPSGKK

KLKIFQPGPGPVPTQLPVLWQVLEIAVSRSISAFT 

LLHCISCKVIEAPGASKCIGPGCCHVAQPDSVYCS 

NDCmKHAAATMKFLSSGKEQKPKPKEKMKMK 

PEKPSLPKCGAQAGIKISSVHKRPAPEKKETTVK 

KAVWPARSEALGKEAACESSTPSWASDHNYNA 

VKPEKTAAPSPSLLYKSTKJEDRRSEEKAAATAAS 

KKTAPPGSTVGKQPAPRNLVPKKSSFANVAAAT 

PAJKKPPSGFKGTEPKRPWLSATPSSGASAARQAG 

PAPAAATAASKKFPGSAALVGAVRKPWPSVPM 

ASPAPGRLGAMSAAPSQPNSQIRQNIRRSLKEIL 

WK/RPLFFILFRVNDSDDLIMTENEVGKIALHIEK 

EMFNLFQVTDN/RAYKSKYRSIMFNLKDPKNQG 

LFHRVLREEISIJ^VRLKPEELVSKELSTWKER 

PARSVMESRTKLHNESKKTAPRQEAIPDLEDSPP 

VSDSEEQQESARAVPEKSTAPLLDVFSSMLKDTT 

SQHRAPILFDLNCKICTGQVPSAEDEPAPKKQKLS 

ASVKKEDLKSKHDSSAPDPAPDSADEVMPEAVP 

EVASEPGLESASHPNVDRTYFPGPPGDGHPEPSPL 

EDLSPCPASCGSGVVTTVTVSGRDPRTAPSSSCT 

AVASAASRPDSTHMVEARODVPKPVLTSVMVPK 

SILAKPSSSPDPRYLSVPPSPNISTSESRSPPEGDTT 

LFLSRLSTIWKGFINMQSVAKFVTKAYPVSGCFD 

YLSEDLPDTIHIGGRIAPKTVWDYVGKLKSSVSK 

ELCLIRFHPAlteEEEVAYISLYSYFSSRGRFGV^^ 

NNMOIVKDLYLffLSAQDPWSKLLPFE 

LSGWR 


3866 


A 


2 


3181 


AQQPVGRRGGASGAGGGRRGTPRPRAGAGPGF 

QVSSGGCRLSKMRRFLRPGHDPVRERLKRDLFQ 

FNKTVEHGFPHQPSALGYSPSLRILAIGTRSGAIK 

LYGAPGVEFMGLHQENNAVTQIHLLPGQCQLVT 

LLDDNSLHLWSLKVKGGASELQEDESFTLRGPP 

GAAPSATQITVVLPHSSCELLYLGTESGNVFWQ 

LPAFRALEDRTISSDAVLQRLPEEARHRRVPEMV 

EALQEHPRDPNQILIGYSRGLWIWDLQGSRVLY 

HFLSSQQLENIWWQRDGRLLVSCHSDGSYCQWP 

VSSEAQQPEPLRSLWYGPFPCKAITRILWLTTRQ * 

G\LPFTIFQGGMPRASYGDRHCISVIHDGQQTAFD

FTSRVIGFTVLTEADPAATFDDPYALWLAEEEL 

WIDLQTAGWPPVQLPYLASLHCSAITCSHHVSN 

IPLKLWERIIAAGSRQNAHFSTMEWPIDGGTSLTP 

APPQRDLLLTGHEDGTVRFWDASGVCLRLLYKL 

STVRVFLTDTDPNENLSAQGEDEWPPLRKVGSF 

DPYSDDPRLGIQKIFLCKYSGYLAVAGTAGQVLV 

LELNDEAAEQAVEQVEADLLQDQEGYRWKGHE 

RLAARSGPVRFEPGFQPFVLVQCQPPAWTSLAL 

HSEWRLVAFGTSHGFGLFDHQQRRQVFVKCTLH 

PSDQLALEGPLSRVKSLKKSLRQSFRRMRRSRVS 

SRKRHPAGPPGEAQEGSAKAERPGLQNMELAPV 

QRKIEARSAEDSFTGFVRTLYFADTYLKDSSRHC 

PSLWAGTNGGTIYAFSLRVPPAERRMDEPVRAE 

QAKEIQLMHRAPWGILVLDGHSVPLPEPLEVAH 

DLSKSPDMQGSHQLLWSEEQFKVFTLPKVSAK 










LKLKLTALEGSRVRRVSVAHFGSRRAEDYGEHH 

LAVLTNLGDIQWSLPLLKPQVRYSCIRREDVSGI 

ASCVFTKYGQGFYLISPSEFERFSLSTKG\LVEPRC 

LVDSAETKNHRPGNGAGPKKAPSRARNSGTQSD 

GEEKQPGLVMERALLSDERAATGWHIEPPWGA 

ASAMAEQSEWLSVQAAR 


3867 


A 


2 


3181 


AQQPVGRRGGASGAGGGRRGTPRPRAGAGPGF 

QVSSGGCRLSKMRRFLRPGHDPVRERLKRDLFQ 

FNKTVEHGFPHQPSALGYSPSLRILAIGTRSGAIK 

LYGAPGVEFMGLHQENNAVTQIHLLPGQCQLVT 

LLDDNSLHLWSLKVKGGASELQEDESFTLRGPP 

GAAPSATQITVVLPHSSCELLYLGTESGNVFVVQ 

LPAFRALEDRTISSDAVLQRLPEEARHRRVFEMV 

EALQEHPRDPNQILIGYSRGLWIWDLQGSRVLY 

HFLSSQQLENIWWQRDGRLLVSCHSDGSYCQWP 

VSSEAQQPEPLRSLVPYGPFPCKAITRILWLTTRQ 

G\LPFTIFQGGMPRASYGDRHCISVIHDGQQTAFD 

FTSRVIGFTVLTEADPAATFDDPYALWLAEEEL 

WIDLQTAGWPPVQLPYLASLHCSAITCSHHVSN 

IPLKLWERIIAAGSRQNAHFSTMEWPIDGGTSLTP 

APPQRDLLLTGHEDGTVRFWDASGVCLRLLYKL 

STVRVFLTDTDPNENLSAQGEDEWPPLRKVGSF 

DPYSDDPRLGIQKIFLCKYSGYLAVAGTAGQVLV 

LELNDEAAEQAVEQVEADLLQDQEGYRWKGHE 

RLAARSGPVRFEPGFQPFVLVQCQPPAVVTSLAL 

HSEWRLVAFGTSHGFGLFDHQQRRQVFVKCTLH 

PSDQLALEGPLSRVKSLKKSLRQSFRRMRRSRVS 

SRKRHPAGPPGEAQEGSAKAERPGLQNMELAPV

QRKEEARSAEDSFTGFVRTLYFADTYLKDSSRHC 

PSLWAGTNGGTIYAFSLRVPPAERRMDEPVRAE 

QAKEIQLMHRAPVVGILVLDGHSVPLPEPLEVAH 

DLSKSPDMQGSHQLLVVSEEQFKVFTLPKVSAK 

LKLKLTALEGSRVRRVSVAHFGSRRAEDYGEHH 

LAVLTNLGDIQWSLPLLKPQVRYSCIRREDVSGI 

ASCVFTKYGQGFYLISPSEFERFSLSTKG\LVEPRC 

LVDSAETKNHRPGNGAGPKKAPSRARNSGTQSD 

GEEKQPGLVMERALLSDERAATGWHIEPPWGA

ASAMAEQSEWLSVQAAR 


3868 


A 


1 


2497 


GDSGGPLVCEEPSGRFFLAGIVSWGIGCAEARRP 

GVYARVTRLRDWrnEATTKASMPLAPTMAPAPA 

APSTAWPTSPESPVVSTPTKSMQALSTVPLDWVT 

VPKLQECGARPAMEKPTRVVGGFGAASGEVPW 

QVSLKEGSRHFCGATVVGDRWLLSAAHCFNHT 

KVEQVRAHLGTASLLGLGGSPVKIGLRRWLHP 

LYNPGILDFDLAVLELASPLAFNKYIQPVCLPLAI 

QKFPVGRKCMISGWGNTQEGNATKPELLQKASV 

GHDQKTCSVLYNFSLTDRMICAGFLEGKVDSCQ 

VSGIKALYESELADARRVLDETARERARLQIEIG 

KLRAELDEVNKSAKKREGELTVAQGRVKDLESL 

FHRSEVELAAALSDKRGLESDVAbLliAyi-AJsJV^ 

DGHAVAKKQLEKETLMRVDLENRCQSLQEELDF 

RKSVFEEEVRETRRRHERRLVEVDSSRQQEYDFK 

MAQALEELRSQHDEQVRLYKLELEQTYQAKLDS 

AKLSSDQNDKAASAAREELKEARMRLESLSYQL 

SGLQKQASAAEDRIRELEEAMAGERDKFRKMLD 










AKEQEMTEMRDVMQQQLAEYQELLDVKLALD 

MEINAYRKLLEGEEERLKLSPSPSSRVTVSRATSS 

SSGSLSATGRLGRSKRKR\WRWRSPW\QRPKRPG 

HGHGWQRWLPPGPAGLGLGQR\HIEEIDLEGKFV 

QLKNNSDKDQSLGNWRJKRQVLEGEEIAYKFTP 

KYIIiUGQMVTVWAAGAGVAHSPPSTLVWKGQ 

SSWGTGESFRTVLVNADGEEVAMRTVKKSSVM 

RENENGEEEEEEAEFGEEDLFHQQGDPRTTSRGC 

YVM 


3869 


A


1 


1942 


RYRAGIPGDGRKDYIRLTRPGLTLPGRAMFARGS 

RRRRSGRAPPEAEDPDRGQPCNSCREQCPGFLLH 

GWRKICQHCKCPREEHAVHAVPVDLERIMCRLIS 

DFQRHSISDDDSGCASEEYAWVPPGLKPEQVYQ 

FFSCLPEDKVPYVNSPGEKYRIKQLLHQLPPHDS 

EAQYCTAL\EE\EEKKELRAFSQQRKREha.G/RLG 

IVRIFPVTI'nGAI\CEECGKQIGGGDIAVF\ASRASL 

GLLLGQPSCFWCTTCQELLVDLIYFYHVGKVYC 

GRHHAECLRPRCQACDEIIFSPECTEAEGRHWHM 

DHFCCFECEASLGGQRYVMRQSRPHCCACYEAR 

HAEYCDGCGEHIGLDQGQMAYEGQHWHASDRC 

FCCSRCGRALLGRPFLPRRGLIFCSRACSLGSEPT 

APGPSRRSWSAGPVTAPLAASTASFSAVKGASET 

TTKGTSTELAPATGPEEPSRFLRGAPHRHSMPEL 

GLRSVPEPPPESPGQPNLRPDDSAFGRQSTPRVSF 

RDPLVSEGGPRRTLSAPPAQRRRPRSPPPRAPSRR 

RHHHHNHHHHHNRHPSRRRHYQCDAGSGSDSE 

SCSSSPSSSSSESSEDDGFFLGERIPLPPHLCRPMP

AQDTAMETFNSPSLSLPRDSRAGMPRQARDKNC 

IVA 


3870 


A 


2 


3485 


FVWRWYVHASCMPPRARSWEGAHAPVGMHV 

AEAHACSSQQQQMPPAQFWMLEWLLHLCAFLS 

TPSFPHWCCCSNPHGSIADKPEEIVPASKPSRAAE 

NMAVEPRVATIKQRPSSRCFPAGSDMNSVYERQ 

GIAVMTPTVPGSPKAPFLGIPRGTMRRQKSIDSRI 

FLSGITEEERQFLAPPMLKFTRSLSMPDTSEDIPPP 

PQSWPSPPPPSPTTYNCPKSPTPRVYGTIKPAFNQ 

NSAAKVSPATRSDTVATMMREKGMYFRRELDR 

YSLDSEDLYSRNAGPQANFRNKRGQMPENPYSE 

VGKIASKAVYVPAKPARRKGMLVKQSNVEDSPE 

KTCSIPIPTIIVKEPSTSSSGKSSQGSSMEIDPQAPE 

PPSQLRPDESLTVSSPFAAAIAGAVRDREKRLEA 

RRNSPAFLSADLGDEHVGLGPPAPRTRPSMFPEE 

GDFADEDSAEQLSSPMPSATPREPENHFVGGAEA 

SAPGEAGRPLNSTSKAQGPESSPAVPSASSGTAG 

PGNYVHPLTGRLLDPSSPLALALSARDRAMKES 

QQGPKGEAPKADLNKPLYIDTKMRPSLDAGFPT 

VTRQNTRGPLRRQETENKYETDLGRDRKGDDK 

KNMLmiMDTSQQKSAGLLMVHTVDATKLDNA 

LQEEDEKAEVEMKPDSSPSEVPEGVSETEGALQI 

SAAPEPTTVPGRTIVAVGSMEEAVILPFRIPPPPLA 

SVDLDEDFIFTEPLPPPLEFANSFDIPDDRAASVPA 

LSDLVKQKKSDTPQSPSLNSSQPTNSADSKKPAS 

LSNCLPASFLPPPESFDAVADSGIEEVDSRSSSDH 

HLETTSTISTVSSISTLSSEGGENVDTCTVYADGQ 

AFMVDKPPVPPKPKMKPHHKSNALYQDALVEE 










DVDSFVIPPPAPPPPPGSAQPGMAKVLQPRTSKL 

WGDVTEKSPILSGPKANVISELNSILQQMNREKL 

AKPGEGLDSPMGAKSASLAPRSPEIMSTISG'mST 

TVTFTVRPGTSQPITLQSRPPDYESRTSGTRRAPS 

PVVSPTEMNKETLPAPLSAATASPSPALSDVFSLP 

SQPPSGDLFGLNPAGRSRSPSPSILQQPISNKPFTT 

KPVHLWTKPDVADWLESLNLGEHKEAFMDNEI 

DGSHLPNLQKEDLIDLGVTRVGHRMNIERALKQ 

LLDR 


3871 


A 


35 


1171 


VESRSAWHEGEDQIDRLDFIRNQMNLLTLDVKK 

KIKEVTEEVANKVSCAMTDEICRLSVLVDEFCSE 

FHPNPDVLKIYKSELNKHIEDGMGRNLADRCTD 

EVNALVLQTQQEIIENLKPLLPAGIQDKLHTLIPC

KKFDLSYNLNYHKLCSDFQEDIVFRFSLGWSSLV 

HRFLGPRNAQRVLLGLSEPIFQLPRSLASTPTAPT 

TPATPDNASQEELMITLVTGLASVTSRTSMGinV 

GGVIWKTIGWKLLSVSLTMYGALYLYERLSWTT 

HAKERAFKQQFVNYATEKLRMIVSSTSANCSHQ 

VKQQIATTFARLCQQVDITQKQLEEEIARLPKEID 

QLEKIQNNSKLLRNKAVQLENELENFTKQFLPSS 

NEES 


3872 


A 


35 


1171 


VESRSAWHEGEDQEDRLDFIRNQMNLLTLDVKK 

KIKEVTEEVANKVSCAMTDEICRLSVLVDEFCSE

FHPNPDVLKIYKSELNKHIEDGMGRNLADRCTD

EVNALVLQTQQEIIENLKPLLPAGIQDKLHTLIPC 

KKFDLSYNLNYHKLCSDFQEDIVFRFSLGWSSLV 

HRFLGPKNAQRVLLGLSEPIFQLPRSLASTFTAPT 

TPATPDNASQEELMITLVTGLASVTSRTSMG'mV 

GGVIWKTIGWKLLSVSLIMYGALYLYERLSWTT 

HAKERAFKQQFVNYATEKLRMIVSSTSANCSHQ 

VKQQIATTFARLCQQVDITQKQLEEEIARLPKEID 

QLEKIQNNSKLLRNKAVQLENELENFTKQFLPSS

NEES 


3873 


A 


2944 


2089 


PVCTALTPGRMTDDKDVLRDVWFGRIPTCFTLY 

QDEITEREAEPYYLLLPRVSYLTLVTDKVKKHFQ 

KVMRQEDISEIWFEYEGTPLKWHYPIGLLFDLLA 

SSSALPWNITVHFKSFPEKDLLHCPSKDAIEAHF 

MSOVIKEADALKHKSQVIhreMQBaCD 

LQNDRFDQFWAINRKLMEYPAEENGFRYIPFRIY 

QTTTERPFIQKLFRPVAADGQLHTLGDLLKEVCP 

SAIDPEDGEKKNQVMMGIEPMLETPLQWLSEHL 

SYPDNFLHISnPQPTD 


3874 


A 


776 


366 


QARGAPSSPMCPLPLAAAAVAAPRAPLRLLNRG
LAAAMSTAQSLKSVDYEVFGRVQGVCFRMYTE 
DEARKIGVVGWVKNTSKGTVTGQVQGPEDKVN 
SMKSWLSKVGSPSSRIDRTNFSNEKTISKLEYSNF 
SIRY 


3875 


A 


1081 


182 


SLSSCQTDPRPMSAPLDAALHALQEEQARLKMR 
LWDLQQLRKELGDSPKDKVPFSVPKIPLVFRGHT 

g-\r\T\TiZT\.Ttxv^t "i^OXTT "DTU/^DT T A/^C AT mTTinPUfVA 
QQDPli Vr jvoi» V oiN LKJLriUrJLl-»Au oAJ-«i i r J-^J-ir Jv v /\ 

EQVLQQKEHTINMEECRLRVQVQPLELPMVTTIQ 

VMVSSQLSGRRVLVTGFPASLRLSEEELLDKLEIF 

FGKTRNGGGDVDVRELLPGSVMLGFARDGVAQ 

RLCQIGQFTVPLGGQQVPLRVSPYVNGEIQKAEI 

RSQPVPRSVLVLNIPDILDGPELHDVLEIHFQKPT 

RGGGEVEALTWPQGQQGLAVFTSESG 


3876 


A 


26 


431 


RMMKCPQALLAIFWLLLSWVSSEDKWQSPLSL 
WHEGDTVTLNCSYEVTNFRSLLWYKQEKKAPT 
FLFMLTSSGIEKKSGRLSSILDKKELSSILNITATQ 
TGDSAIYLCAVEAQCSLVTCSLYSNSTAEALQL 


3877 


A 


3 


1291 


KAFRLLAERGAAAAMLWSGCRRFGARLGCLPG 

GLRVLVQTGHRSLTSCmPSMGLNEEQKEFQKV 

AFDFAAREMAPNMAEWDQKELFPVDVlVfiUCAA 

QLGFGGVYIQTDVGGSGLSRLDTSVIFEALATGC 

TSTTAYISIHNMCAWMIDSFGNEEQRHKFCPPLC 

TMEKFASYCLTEPGSGSDAASLLTSAKKQGDHYl 

LNGSKAnSGAGESDIYWMCRTGGPGPKGISCIV 

VEKGTPGLSFGKKEKKVGWNSQPTRAVEFEDCA 

VPVANRIGSEGQGFLIAVRGLNGGRINIASCSLGA 

AHASVILTRDHLNVRKQFGEPLASNQYLQFTLA 

DMATRLVAARLMVRNAAVALQEERKDAVALCS 

MAKLFATDECFAICNQALQMHGGYGYLKDYAV 

QQYVRDSRVHQILEGSNEVMRILISRSLLQE 


3878 


A 


10 


1014 


LPGSTISSSGCQAPGRADSSGGARNSRRGDSRPG 

SCNRQAVAPPCPSPGPQSRHWIHRGTAPQAGETR 

TLGRGSSAPNACSASVTPCCPSSPPS*SCL*PTRRS 

PQNSSSTEVYRGFWQHGLPST**PFSS*QWPGQH 

TQGCSKLLGKQTTHLPCSTWPA**PSPSCLTRFR* 

W*PSLMCLWASSCSVCV*SPSGSCRH*LWGTHST

SRTC*ARRSSALPTGLCTDDTSWASSSKARPCAL 

QRPSSLSSLSPCLTC*W*LSSSSPMSARSPAGAET 

GSWATGSFRiiTQmiSSRLTSTSHSARSAWK^ 

TESTPSWPRPVSWTSGEDPASPAPAI


3879 


A 


200 


699 


LLLTGYIQTLQNQQLSGNQQEMQAVDNLTSAPG

NTSLCTRDYKITQVLFPLLYTVLFFVGLITNGLA

MRIFFQIRSKSNFIIFLKNTVISDLLMILTFPFKILS

DAKLGTGPLRTFVCQVTSVIFYFTMYISISFLGLIT

IDRYQKTTRPFKTSOTKNLLGAKILK


3880 


A 


26 


169 


QPETDTMVHLTPEEKSAVTALWGKVNVDEDAG
DDLCQILVDRPRLRI


3881 


A 


37 


1100 


TPLFDFWPGFVLSWLQPLSASLRARRAASGPPAC

RIMPTTVDDVLEHGGEFHFFQKQMFFLLALLSAT

FAPIYVGIVFLGFTPDHRCRSPGVAELSLRCGWSP

AEELNYTVPGPGPAGEASPRQCRRYEVDWNQST

FDCVDPLASLDTNRSRLPLGPCRDGWVYETPGSS

IVTEFNLVCANSWMLDLFQSSVNVGFFIGSMSIG

YIADRFGRKLCLLTTVLINAAAGVLMAISPTYTW

MLIFRLIQGLVSKAGWLIGYILITEFVGRRYRRTV

GIFYQVAYTVGLLVLAGVAYALPHWRWLQFTV

ALPNFFFLLYYWCIPESPRWLISQNKNAEAMRJIK

HIAKKNGBCSLPASL


3882 


A 


573 


1620 


KSKCRFPEGLSEGFGPMRKEALSSGSVQEAEAM
LDEPQEQAEGSLTVYVISEHSSLLPQDMMSYIGP
KRTAWRGIMHREAFNNGRRIYQVAQAMSLTED
VLAAALADHLPEDKWSAEK31RPLKSSLGYEITFS

LLNPDPKSHDVYWDffiGAVRRYVQPFLNALGAA 
GNFSVDSQILYYAMLGVNPRFDSASSSYYLDMH 
SLPHVINPVESRLGSSAASLYPVLNFLLYVPELAH 
SPLYIQDKDGAPVATNAFHSPRWGGIMVYNVDS 
KTYNASVLPVRVEVDMVRVMEVFLAQLRLLFGI 

AQPQLPPKCLLSGPTSEGLMTWELDRLLWARSV 
ENLATATTTLTSLA 


3883 


A 


2369 


844 


RIHREEDFQFILKGIARLLSNPLLQTYLPNSTKKIQ 

FHQELLVLFWKLCDFNKVGQPRGALQGDGEQLP 

Q*PGGRDSVRLRGVGQSCPSLELSPLGPSPHP*KF 

LFFVLKSSDVLDDLVPILFFLNDARADQSRVGLM 

fflGVFELLLLSGECNFGVRLNKPYSIRVPMDIPVF 

TGTHADLLIVWFHKnXSGHQRLQPLFDCLLTIVV 

NVSPYLKSLSMVTANKLLHLLEAFSTTWFLFSAA 

QNHHLVFFLLEVFNNIIQYQFDGNSNLVYAIIRKR 

SIFHQLANLPTDPPTIHKALQRRRRTPEPLSRTGS 

QGGAPPWRAPAPLPLQSQAPSRPVWWLLQALTS 

*PRSPRCQRMAPCGPWNLSPSRAWRMAARLRGS

PARHGGSSGDRP/HSSASGQWSPTPEWVLSWKS 

KLPLQTIMRLLQVLVPQVEKICIDKGLTDESEILR 

FLQHGTLVGLLPVPHPILIRKYQANSGTAMWFRT 

YMWGVIYLRNVDPPVWYDTDVKLFEIQRV 


3884 


A 


1


804 


NGPRAPFSQEGQSTGPPPLIPRLGQHGAQGRIPPL 

NPGQGPGPNKDDSRGPPNHHMGPMSERRHEQSG 

GPEHGPERGPLRGGQDCRGPPDRRGPHPDFPDDF 

SRPDDFHPDKRFGHRLREFEGRGGPLPQEEKWR 

RGGPGPPFPPDHREFSEGDGRGAARGPPGAWEG 

RRPGG*TFPPGSRGPTFS/SGAEEESFRRGAPPRHE 

GRAPPRGRDGFPGPEDFGPEENFDASEEAARGRD 

LRGRGRGTPRGERVTKDTWSGRIGCRfflWL 


3885 


A 


3 


996


GRRRAGPAHSARMYNMMETELKPPGPQQTSGG 
GGGNSTAAAAGGNQKNSPDRVKRPMNAFMW
SRGQRRKMAQENPKMHNSEISKRLGAEWKLLSE 
TEKRPFIDEAKRLRALHMKEHPDYKYRPRRKTK 
TLMKKDKYTLPGGLLAPGGNSMASGVGVGAGL 
GAGVNQRMDSYAHMNGWSNGSYSMMQDQLG 
YPQHPGLNAHGAAQMQPMHRYDVSALQYNSM 
TSSQTYMNG/SRPTYSMSYSQQGTPGMAPGS\MG 
SWKSEASSSPPVVTSSSHSRAPCQAGDLRDMIS 
MYLPGAEVPEPAAPSRLHMSQHYQSGPVPGTAI 
NGTLPLSHM 


3886 


A 


773 


317 


QCTQKAAEGYTQFYYVDVLDGKLACVNKCTKG 
TKSQMNCNLGTCQLQRSGPRCLCPNTNTHWYW 
GETCEFNIAKSLVYGIVGAVMAVLLLALIILIILFS 
LSQ\RKRfIRPESEGEADFGLENATNNFG\PTLETV 
DSGTELHIQ\RPEMVASTV 


3887 


A 


3 


466 


VDFRVKTLLVDNKCFVLQLWDTAGQERYHSMT 

RQLLRKADGVVLMYDITSQESFAHVRYWLDCL 

QDAGSDGVVILLLGNKMDCEEERQVSVEAGQQL 

AQELGVYFGECSAALGHNILEPWNLARSLRMQ 

EEGLKDSLVKVAPKRPPKRFGCCS 


3888 


A 


3412 


3144 


QNIDITNFSSSWNDGLAFCALLHTYLPAHIPYQEL 

NSQDKRRNFMLAFQAAESVGIKSTLDINEMVRT 

ERPDWQNVMLYVTAIYKYFET 


3889 


A 


1 


1160 


LVVTAITAELAFPNEYTRMSTSELISELFNDCGLL 

DSSKLCDYENRFNTSKGGELPDRPAGVGVYSAM 

WQLALTLILKIVITIFTFGMKIPSGLFIPSMAVGAI 

AGRLLGVGMEQLAYYHQEWTVFNSWCSQGAD 

CITPGLYAMVGAAACLGGVTRMTVSLWIMFEL 

TGGLEYIWLMAAAMTSKWVADALGREGIYDA 

HIRLNGYPFLEAKEEFAHKTLAMD\qviKPRRN^ 

LLTVLTQDSMTVEDVETTISETTYSGFPVVVSRES 

QRLVGFVLRRDLnSIENARKKQDGWSTSnYFTE 

HSPPLPPYTPPTLKLRNILDLSPFrVTDLTPMEIVV 

DIFRKLGLRQCLVTHNGRLLGnTKKDVLKHIAQ 

MANQDPDSILFN 


3890 


A 


1 


387 


SWCWTGIFVLGrrNLRLEGSWYRSLWGPGFNTT 
TATLGFGAPQAPVGDVALNQPDMCVYRRGRKK 
RWYTKXQLKELENEYAINKFINKDKRRRISAAT 
NLSERQVTIWFQNRRVKDKKIVSKLKDTVS 


3891 


A 


2 


2914 


RGGGGDHKMADLSLLQEDLQEDADGFGVDDYS 

SESDVIIIPSALDLAST/QDEMVERPLGRL\DK\YA 

ASENHI*PDKMVAPEFASIPLRE\VCDDERDCIAV 

LGKN*PDWADDSEPTVVRAAELEQVPHIALFLFK 

KTEa.SITICFFSKFLLPYCGLDTLADQN\NQVRKT 

SQAALL\ALLEQELIERFDVETKVCPVLIELTAPDS 

>n>DVKTEAVAIMCKMAP\MVGKDITERLILPRFC 

EMCCDCRMFHWRKWCAANFGDICSWGQQAT 

EEMLLPRFFQLCSDNVWGVRKACAECFMAVSC 

ATCQEIRRTKLSALFINLISDPSRWVRQAAFQSLG 

PFISTFANPSSSGQYFKEESKSSEEMSVENNKRTR 

DQEAPEDVQVRPEDTPSDLSVSNSSVILENTMED 

HAAEASGKPLGEISVPLDSSLLCTLSSESHQEAAS 

NENDKKPGNYKSMLRPEVGTTSQDSALLDQELY 

NSFHFWRTPLPEIDLDIELEQNSGGKPSPEGPEEE 

SEGPVPSSPNITMATRKELEEMIENLEPHIDDPDV 

KAQVEVLSAAliRASSLDAHEETISIEKRSbLQDE 

LDINELPNCKINQEDSVPLISDAVENMDSTLHYIH 

NDSDLSNNSSFSPDEERRTKVQDVVPQALLDQY 

LSMTDPSRAQTVDTEIAKHCAYSLPGVALTLGR 

QNWHCLRETYETLASDMQWKVRRTLAFSIHELA 

VILGD\QLTAADLVPIFNGFLK*PSMKSRIGVLKH 

LHDFLKLLHIDKRREYLYQLQEFLVTDNSKNrWR 

FRAELAEQLILLLELYSPRDVYDYLRPIALNLCAD 

KVSSVRWISYKLVSEMVKKLHAATPPTFGVDLIN 

ELVENFGRCPKWSGRQAFVFVCQTVIEDDCLPM 

DQFAVHLMPHLLTLANDRVPNVRVLLAKTLRQT 

LLEKDYFLASASCHQEAVEQTIMALQMDRDSDV 

KYFASIHPASTKISEDAMSTASSTY 


3892 


A 


158 


2191 


VPLPAPSGLSGGGSRGAGCKKAPPGRAPAPGLAP 

LRPSEPTMAVPPGHGPFSGFPGPQEHTQVLPDVR 

LLPRRLPLAFRDATSAPLRKLSVDLIKTYKHINEV 

YYAKKKRRAQQAPPQDSSNKKEKKVLNHGYDD 

DNHDYIVRSGERWLERYEIDSLIGKGSFGQVVKA 

YDHQTQELVAIKIIKNKKAFLNQAQIELRLLELM 

NQHDTEMKYYIVHLKRHFMFRN\HLCLVFELLS 

YNLYDLLRNTHFRGVSLNLTRKLAQQLCTALLF 

LATPELSIIHCDLKPENILLCNPKRSAIKIVDFGSS 

CQLGQRIYQYIQSRFYRSPEVLLGTPYDLAIDMW 

SLGCILVEMHTGEPLFSGSNEVCPQEGVDQMNRI 

VEVLGIPPAAMLDQAPKARKYFERLPGGGWTLR 

RTKELRKDYQGPGTRRLQEVLGVQTGGPGGRRA 

GEPGHSPAD\Y\LRFQDLVLRMLEYEPAARISPLG 

ALQHGFFRRTADEATNTGPAGSSASTSPAPLDTC 

PSSSTASSISSSGGSSGSSSDNRTYRYSNRYCGGP 

GPPITDCEMNSPQVPPSQPLRPWAGGDVPHKTH 
QAPASASSLPGTGAQLPPQPKYLGRPF^r i blrFft' 
ELMDVSLVGGPADCSPPHPAPAPQHPAASALRT 
RMTGGRPPLPPPDDPATLGPHLGLRGVPQSTAAS 
S 


3893 


A 


68 


258 


PEEYYPFSPTLQQLFFFLLDSDMGSRPESMGCRK 
NTVPRPASPTEAGTDPQTFLHTWVSECRD 


3894 


A 


1120 


136 


SLPLAPAPAVAGPVALCPAGLCPAQPGMPAGPA 

AASGSHPEVGSVLQRSSQPHWPNPWPGAGHLPP 

PAGPFPYNPPAGPGAAAGLA*SPPRSSPTPCSVGP 

QSCPANASAPPAQPCLAGAPPAASLPPPGPGSVS 

AAPAPGGPAPAEPPLGVPPVPAWLLPDSPPLPGT 

HSGPPPAAVSLPPAAAACPVVVPPPLPHHPPDLES 

PSAAAPNPGCAGGIRHFPPGSPEASSPLRPAAAPA 

LLPLPRPPS*PA^PWKPLHSPVAVAGGSFVAGGSV 

LPAPDLDQPRPSGPPAASPTPGPGVAQPPPGSAVL 

PTVP*APPVSGAAPGRKREW 


3895 


A 


2 


1347 


FGAVSYRPGNGSCWVKVTASSDLSDLISCLCPPR 

SLCSSQACVLPVPGPSLLLPQGLHVGCASAGTRW 

PLSCSIDFQRLLAHEEETQKRRAKESGMAFTQLT 

FRDVAIEFSQDEWKCLNSTQRTLYRDVMLENYR 

NLVSLDLSRNCVEKELAPQQEGNP/ARSIPHSDIGT 

T*KT*H*RVLLQGNQEKNTRL*LSVER**KKLQQ 

SDYGPKRKSYL*ERPTR*KRYRKQVY*TSA\*LSF 

LPHPHELQQFQAEGKIYECNHVEKSVNHGSSVSP 

PQIISSTIKTHVSNKYGTDFICSSLLTQEQKSCIRE 

KPYRYIEGDKALNHGSHMTVRQVSHSGEKGYKC 

DLCGKVFSQKSNLARHWRVHTGEKPYKCNECD 

RSFSRNSCLALHRRVHTGEKPYKCYECDKVFSR 

NSCLALHQKTHIGEKPYTCKECGQAFSVRSTLTN 

HQVIHSDK 


3896 


A 


202 


498 


MVQSCSAYGCKNRYDKDKPVSFHKFPLTRPSLC 

KEWEAAVRRKNFKFIXYSSICSEHFTPIXJFKR^ 

NNKLLKENAVPTIFLCTEPHDKKEDLLEPQEQ 


3897 


A 


2 


382 


SHGLSRAPHLSAAPAPALASRPCFSSAPCSQGGG 
GGGPATMIHFILLFSRQGKLRLQKWYITLPDKER 
KKITREIVQIILSRGHRTSSFVDWKELKLV YJsJ<Y A 
SLYFCCAIENNQDNELLTLENVHR 


3898 


A 


718 


305 


SEQEPLLGDTPGSREWDILETEEHYKSRWRSIRIL 
YLTMFLSSVGFSVVMMSIWPYLQKIDPTADTSFL 
GWVIASYSLGQMVASPIFGLWSNYRPRKEPLIVSI 
LISVAANCLYAYLHIPASHNKYYMLVARGLLGIG 


3899 


A 


24 


718 


FRGRPGPEREGKGNHSFVEVARVIWDLHSRLU 
GAMAERKGTAKVDFLKKjEKEIQQK wD 1 bKvr b 
WASNLEKQTSKGKYFVTFPYPYMNGRLHLGHT 
FSLSKCEFAVGYQRLKGKCCLFPFGLHCTGMPIK 
ACADKLKREIELY/GCPPDFPDEEEEEEETSVKTE 
DniKDKAKGKKSKAA/AKAGSSKYQWOlMKbbU 
LSDEEIVKFSEAEHWLDYFNALAIQDLKRMG 


3900 


A 


360 


1 


VPAT«1«INV^PSSSESSEPDLSSRSSSSDAPSSSPSVP 
SPCSLSLSSPESPLLPTLLSSKSPAGSAGPTCGCPS 
GPGLRATA/PSRLSSSIAAH/SSSAPETSRPAAARE 
RSPPLHDRESHE 


3901 


A 


193 


345 


GEWAVPPAPGGQGVSIPHGPEPGQGSGVHIAPRQ 
GEGSDRTEPLICPKAAP 


3902 


A 


1185


1389 


NPAARSAAAREGSPALPPPPVS/SSSGLGLLLPLSP 
PGSHAANPALSPRAPHSHYRPRPRCGPRRRPR 


3903 


A 


63 


396 


NNMRNPHLSSNHYLM.ARTETWARMESVKQRI 
LAPGKEGLKNFAGKSLGQIYRVLEKKQDTGETIE 
LTEIX3KPL*VPERKAPLCDCTCFGLPRRYIIAIMS 
GLGFCISFG 


3904 


A 


732 


1046 


AMSECPLILYIHKHIDTYSQSYLFNDLFYPVYSGG 
RMVTYEHLREVVFGKSEDEHYPLW*VLFGK*YA 
VAPNALMFIRFM*NCTFVPKLP*VMDLK**LQYK 
SR 


3905 


A 


46 


910 


QPPPPPPPPPSPPPPPFPPARALSHLRLHPDACLFPS 

PFPLPCSTMPGMMEKGPELLGKNRSANGSAKSP 

AGGGGSGASSTNGGLHYSEPESGCSSDDEHDVG 

MRVGAEYQARIPEFDPGATKYTDKDNGGMLVW 

SPYHSIPDAKLDEYIAIAKEKHGYNVEQALGMLF 

WHKHNIEKSLADLPNFTPFPDEWTVEDKVLFEQ 

AFSFHGKSFHRIQQMLPDKTIASLVKYYYSWKK 

TRSRTSLMDRQARKI.ANRHNQGDSDDDVEETHP 

MDGNDSDYDPKKEAKKEGMS 


3906 


A 


2 


513 


KVCNCCSQELETSFTYVDKNINLEQRNRSSPSAK 
GHNHPGELGWENPNEWSQEAAISLISEEEDDTSS 
EATSSGKSIDYGFISAILFLVTGILLVnSYIVPREV 
TVDPNTVAAREMERLEKESARLGAHLDRCVIAG 
LCLLTLGGVELSCLLMMSMWKGELYRRNRFAS 


3907 


A 


71 


412 


lUMSNCLQNFLKITSTRLLCSRLCQQLRSKRKFF 

GTVPISRLHRRyvrrpiGLVTPLGVGTHLVWDRLI 

GGE$GI)/SEyGfeESiK:SE^^ 

NEQNFVSKSD* 


3908 


A 


77 


746 


LGTLLGWRAPLFSRCLAFHSPFELLNTPKLVKTAE 

LPPDRlsTYVLGAHPHGIMCTGFLCNFSTESNGFSQ 

LFPGLRPWLAVLAGLFYLPVYRDYIMSFGLCPVS 

RQSLDFILSQPQLGQAVVIMVGGAHEALYSVPGE 

HCLTLQKRKGFVRLALRHGASLVPVYSFGENDIF 

RLKAFATGSWQHWCQLTFKKLMGFSPCIFWGR 

GLFSATSWGLLPFAVPITTV 


3909 


A 


1 


793 


FRAAGRPAAAMGDIPWGLSSWKASPGKVTEAV 

KEAIDAGYRHFDCAYFYHNEREVGAGIRCKIKE 

GAVRREDLLIATKLWCTCHKKSLVETACRKSLK 

ALKLNYLDLYLIHWPMGFKPPHPEWIMSCSELSF 

CLSHPRVQDLPLDESNMVIPSDTDFLDTWEAME 

DLVrrGLVKNIGVSNFNHEQLERLLNKPGLRFKP 

LTNQIECHPYLTQKNLISFCQSRDVSVTAYRPLG 

GSCEGVDLEDNPVIKiUAKEHGKSPAQILI 


3910 


A 


202 


705 


FFTMHRKKVDNRIRILIENGVAERQRSLFWVGD 

RGKDQWILHHMLSKATVKARPSVLWCYKKEL 

GFSSHRKKRmQLQKKDCNGTLNIKQDDPFELFI 

AATNIRYCYYNETHKILGNTFGMCVLQDFEALTP 

NLLARTVETVEGGGLVVILLRTMNSLKQLYTVT 

M 


3911 


A 


3 


723 


AGRGARAAGEGGGPFKSRPRPLPSSRSLPAVGGG 

RYGADKMAAGGAVAAAPECRLLPYALHKWSSF 

SSTYLPENILVDKPNDQSSRWSSESNYPPQYLILK 

LERPAWQMTFGKYEKTHVCmKKFKVFGGMN 

EE>nvrEELLSSGLKl^YNKETFTLKHKroEQMF^ 

RFIKrsa>lXSWGPSFOTSIWYVELSGIDDPDIVQPC 




LNWYSKYREQEAIRLCLKHFRQHNYTCAFESLQ 
KK.T 


3912 


A 


2 


461 


FEKKQLRRPSLFLLGCCSFGIMAPSLWKGLEGIG 

LFAIJSJIAAFSAAQHRSYMRLTEKEDESLPIDIV^ 

QTLLAFAVTCYGIVHIAGEFKDMDATSELKNKTF 

DTVRNHPSFYVFNHRGSEYFSGPSDTANSSNQDA 

LSSNTSLKLRKLESLRR 


3913 


A 


362 


20 


APGRPEAKVPERSRESGSRRVRGPLLQLRPGRTS 
RPASGRGRGGAGGSYGKMRKPDSKIVLLGDMN 
VGKTSLLQRYMERRFPDTVSTVGGAFYLKQWRS 
YNISIWDTAGEAGAA 


3914 


A 


1 


7545 


PGIRVGITSQTGLSSNLQENCSKLAFISSHGTEKQ 

LQCMPMEGRGRASSSISDLQGKGFEKGTGEKHV 

PGVGSARHSPQASAGGSPWQRGKAQTRWLGKP 

DPGRKRRRGSPQEEGGLRVSAAARLLCSGANRC 

KVLVRQNSTPNTQQPAVHPSTPPSRPLPQAGRCL 

VAPLRPHPDWVAAKTLAKALRAPGKPWRLAAP 

SPLGDLGAPGLPGPSTAPRTLSVEEPGVECNQLC 

LYADVTDPVLCLGQKDPGVEGKHCEKEKISSSK 

ELKHVHAKSEPSKPARRLSESLHWDENKNESKI 

EREHKRRTSTPVIMEGVQEETDTRDVKRQVERSE 

ICTEEPQKQKSTLKNEKHLKKDDSETPHLKSLLK

KEVKSSKEKPEREKTPSEDKLSVKHKYKGDCMH 

KTGDETELHSSEKGLKVEENIQKQSQQTKLSSDD 

KTERKSKHRNERKLSVLGKDGKPVSEYIIKTDEN

VRKENNKKERRLSAEKTKAEHKSRRSSDSKIQK 

DSLGSKQHGITLQRRSESYSEDKCDMDSTNMDS 

NLKPEEWHKEKRRTKSLLEEKLVLKSKSKTQG

KQVKWETELQEGATKQATTPKPDKEKNTEEND 

SEKQRKSKVEDKPFEETGVEPVLETASSSAHSTQ 

KDSSHRAKLPLAKEKYKSDKDSTSTRLERKLSD 

GHKSRSLKHSSKDIKKKDENKSDDKDGKEVDSS 

HEKARGNSSLMEKKLSRRLCENRRGSLSQEMAK 

GEEKLAANTLSTPSGSSLQRPKKSGDMTLIPEQEP 

MEDDSEPGVENVFEVSKTQDNRNNNSHQDIDSEN 

MKQKTSATVQKDELRTCTADSKATAPAYKPGR 

GTGVNSNSEKHADHRSTLTKKMHIQSAVSKMNP

GEKEPIHRGTTEVNIDSETVHRMLLSAPSENDRV

QKNLKNTAAEEHVAQGDATLEHSTNLDSSPSLSS

VTVVPLRESYDPDVIPLFDKRTVLEGSTASTSPAD 

HSALPNQSLTVRESEVLKTSDSKEGGEGFTVDTP 

AKASITSKRHIPEAHQATLLDGKQGKVIMPLGSK 

LTGVIVENENITKEGGLVDMAKKENDLNAEPNL

KQTIKATVENGKKDGIAVDHVVGLNTEKYAETV 

KLKHKRSPGKVKDISIDVERRNENSEVDTSAGSG

SAPSVLHQRNGQTEDVATGPRRAEKTSVATSTE 

GKDKDVTLSPVKAGPATTTSSETRQSEVALPCTS 

ffiADEGLnGTHSRNNPLHVGAEASECTVFAAAEE 

GGAWTEGFAESEITLTSTKEGESGECAVAESED 

RAADLLAVHAVKIEANVNSVVTEEKDDAVTSAG

SEEKCDGSLSRDSETVEGTITFISEVESDGAVTSAG 

TEIRAGSISSEEVDGSQGNMMRMGPKKETEGTV

TCTGAEGRSDNFVICSVTGAGPREERMVTGAGV 

VLGDNDAPPGTSASQEGDGSVNDGTEGESAVTS 

TGITEDGEGPASCTGSEDSSEGFAISSESEENGESA 

MDSTVAKEGTNVPLVAAGPCDDEGIVTSTGAKE 

EDEEGEDWTSTGRGNEIGHASTCTGLGEESEGV 

LICESAEGDSQIGTVVEHVEAEAGAAIMNANENN 

VDSMSGTEKGSKDTDICSSAKGIVESSVTSAVSG 

KDEVTPVPGGCEGPMTSAASDQSDSQLEKVEDT 

TISTGLVGGSYDVLVSGEVPECEVAHTSPSEKED 

EDIITSVENEECDGLMATTASGDITNQNSLAGGK

NQGKVLIISTSTTNDYTPQVSAITDVEGGLSDALR

TEENMEGTRVTTEEFEAPMPSAVSGDDSQLTASR 

SEEKDECAMISTSIGEEFELPISSATTIKCAESLQP 

VAAAVEERATGPVLISTADFEGPMPSAPPEAESP 

LASTSKEEKDECALISTSIAEECEASVSGVWESE 

NERAGTVMEEKDGSGIISTSSVEDCEGPVSSAVP

QEEGDPSVTPAEEMGDTAMISTSTSEGCEAVMIG 

AVLQDEDRLTITRVEDLSDAAIISTSTAECMPISA

SIDRHEENQLTADNPEGNGDLSATEVSKHKVPM

PSLIAENNCRCPGPVRGGKEPGPVLAVSTEEGHN 

GPSVHKPSAGQGHPSAVCAEKEEKHGKECPEIGP 

FAGRGQKESTLHLINAEEKNVLLNSLQKEDKSPE 

TGTAGGSSTASYSAGRGLEGNANSPAHLRGPEQ 

TSGQTAKDSSVSSIRYLAAVNTGAIKADDMPPVQ

GTVAEHSFLPAEQQGSEDNLKTSTTKCITGQESKI 

APSHTMIPPATYSVALLAPKCEQDLTDCNDYSGK 

WTDQASAEKTGDDNSTRKSFPEEGDIMVTVSSE 

ENVCDTGNEESPLNVLGGLKLKANLKMEAYVPS

EEEKNGEELAPPESLCGGKPSGIAELQREPLLVNE 

SLNVENSGFRTNEEIHSESYNKGEISSGRKDNAE 

AISGHSVEADPKEVEEEERHMPKRKRKQHYLSSE 

DEPDDNPDVLDSRIETAQRQCPETEPHATKEENS 

RDLEELPKTSSETNSTTSRVMEEKDEYSSSETTGE

KPEQNDDDTIKSQE 


3915 


A 


1


7545 


PGIRVGITSQTGLSSNLQENCSKLAHSSHGTEKQ 

LQCMPMEGRGRASSSISDLQGKGFEKGTGEKHV 

PGVGSARHSPQASAGGSPWQRGKAQHIWLGKP 

DPGRKRRRGSPQEEGGLRVSAAARLLCSGANRC 

KVLVRQNSTPNTQQPAVHPSTPPSRPLPQAGRCL 

VAPLRPHPDWVAAKTLAKALRAPGKPWRLAAP 

SPLGDLGAPGLPGPSTAPRTLSVEEPGVECNQLC 

LYADVTDPVLCLGQKDPGVEGKHCEKEKISSSK 

ELKHVHAKSEPSKPARRLSESLHWDENKNESKI

EREHKRRTSTPVIMEGVQEETDTRDVKRQVERSE 

ICTEEPQKQKSTLKNEKHLKKDDSETPHLKSLLK 

KEVKSSKEKPEREKTPSEDKLSVKHKYKGDCMH 

KTGDETELHSSEKGLKVEENIQKQSQQTKLSSDD 

KTERKSKHRNERKLSVLGKDGKPVSEYIIKTDEN 

VRKENNKKERRLSAEKTKAEHKSRRSSDSKIQK 

DSLGSKQHGITLQRRSESYSEDKCDMDSTNMDS 

NLKPEEVVHKEKRRTXSLLEEKLVLKSKSKTQG

KQVKVVETELQEGATKQATTPKPDKEKNTEEND 

SEKQRKSKVEDKPFEETGVEPVLETASSSAHSTQ 

KDSSHRAKLPLAKEKYKSDKDSTSTRLERKLSD 

GHKSRSLKHSSKDIKKKDENKSDDKDGKEVDSS 

HEKARGNSSLMEKKLSRRLCENRRGSLSQEMAK 

GEEKLAANTLSTPSGSSLQRPKKSGDMTLIPEQEP 

MEIDSEPGVENVFEVSKTQDNKNNNSHQDHDSEN 

MKQKTSATVQKDELRTCTADSKATAPAYKPGR 

GTGVNSNSEKHADHRSTLTKKMHIQSAVSKMNP 

GEKEPIHRGTTEVNIDSETVHRMLLSAPSENDRV

QKNLKNTAAEEHVAQGDATLEHSTNLDSSPSLSS 

VTWPLRESYDPDVIPLFDKRTVLEGSTASTSPAD 

HSALPNQSLTVRESEVLKTSDSKEGGEGFTVDTP 

AKASITSKRHIPEAHQATLLDGKQGKVIMPLGSK

LTGVIVENENITKEGGLVDMAKKENDLNAEPNL 

KQTIKATVENGKKDGIAVDHVVGLNTEKYAETV 

KLKHKRSPGKVKDISIDVERRNENSEVDTSAGSG

SAPSVLHQKNGQTEDVATGPRRAEKTSVATSTE 

GKDKDVTLSPVKAGPATTTSSETRQSEVALPCTS

ffiADEGLnGTHSRNNPLHVGAEASECTVFAAAEE 

GGAWTEGFAESETFLTSTKEGESGECAVAESED

RAADLLAVHAVKIEANVNSWTEEKDDAVTSAG 

SEEKCDGSLSRDSEIVEGTITFISEVESDGAVTSAG 

TEIRAGSISSEEVDGSQGNMMRMGPKKETEGTV 

TCTGAEGRSDNFVICSVTGAGPREERMVTGAGV 

VLGDNDAPPGTSASQEGDGSVNDGTEGESAVTS 

TGITEDGEGPASCTGSEDSSEGFAISSESEENGESA 

MDSTVAKEGTNVPLVAAGPCDDEGIVTSTGAKE 

EDEEGEDWTSTGRGNEIGHASTCTGLGEESEGV 

LICESAEGDSQIGTWEHVEAEAGAAIMNANENN 

VDSMSGTEKGSKDTDICSSAKGIVESSVTSAVSG 

KDEVTPVPGGCEGPMTSAASDQSDSQLEKVEDT 

TISTGLVGGSYDVLVSGEVPECEVAHTSPSEKED 

EDIITSVENEECDGLMATTASGDITNQNSLAGGK 

NQGKVLIISTSTTNDYTPQVSAITDVEGGLSDALR

TEENMEGTRVTTEEFEAPMPSAVSGDDSQLTASR 

SEEKDECAMISTSIGEEFELPISSATTIKCAESLQP 

VAAAVEERATGPVLISTADFEGPMPSAPPEAESP 

LASTSKEEKDECALISTSIAEECEASVSGVWESE 

NERAGTVMEEKDGSGIISTSSVEDCEGPVSSAVP 

QEEGDPSVTPAEEMGDTAMISTSTSEGCEAVMIG 

AVLQDEDRLTITRVEDLSDAAIISTSTAECMPISA 

SIDRHEENQLTADNPEGNGDLSATEVSKHKVPM 

PSLIAENNCRCPGPVRGGKEPGPVLAVSTEEGHN 

GPSVHKPSAGQGHPSAVCAEKEEKHGKECPEIGP 

FAGRGQKESTLHLINAEEKNVLLNSLQKEDKSPE 

TGTAGGSSTASYSAGRGLEGNANSPAHLRGPEQ 

TSGQTAKDSSVSSIRYLAAVNTGAIKADDMPPVQ 

GTVAEHSFLPAEQQGSEDNLKTSTTKCITGQESKI 

APSHTMIPPATYSVALLAPKCEQDLTIKNDYSGK 

WTDQASAEKTGDDNSTRKSFPEEGDIMVTVSSE 

ENVCDIGNEESPLNVLGGLKLKANLKMEAYVPS 

EEEKNGEELAPPESLCGGKPSGIAELQREPLLVNE

SLNVENSGFRTNEEIHSESYNKGEISSGRKDNAE 

AISGHSVEADPKEVEEEERHMPKRKRKQHYLSSE 

DEPDDNPDVLDSREETAQRQCPETEPHATKEENS 

RDLEELPKTSSETNSTTSRVMEEKDEYSSSETTGE 

KPEQNDDDTIKSQE


3916 


A 


2 


773 


GPFGVLWPSAKPGPVTAVEARPPDASDPEGLRG 
GSPAPLLAPGPLDPSGRLHPAVSMMSYLKQPPYG 
MNGLGLAGPAMDLLHPSVGYPATPRKQRRERTT 
FTRSQlI)VLEALFAKTRYPDIFMREEVALKrNLPE 

SRVQVWFBCNRRAKCRQQQQSGSGTKSRPAKKK 
SSPVRESSGSESSGQFTPPAVSSSASSSSSASSSSA 
NPAAAAAAGLWAKLPCPLHIFSLCVFIEENRLV 
SGSWARDIRSVEETDKSGYR 


3917 


A 


2 


776 


R>m>GRRFRPPGIJlRI^GPHMPREPRGYRTO^ 

ALRELVPSSHAGSGASEHCQNNRQGSRQHRASR 

NVQAGGALAPPRHLCGLCSRLHFLKPDLSVRAA 

PSRAGASVMALRKELLKSIWYAFTALDVEKSGK 

VSKSQLRVLSHNLYTVLHIPHDPVALEEHFRDDD 

DGPVSSQGYMPYLNKYILDKVEEGAFVKEHFDE 

LCWTLTAKKNYRADSNGNSMLSNQDAFRLWCL 

FNFLSEDKYPLIMDPDEGEYLLKRYS 


3918 


A 


10 


318 


WQDLVCLGGSRAQBQKPLQQLWNAILLVAMLL 
CTGLWQAQRQASRQSQRELGGQVDLJFKRRVV 
RRLASLKTRRCRLSRAAQGLPDPGAETCAVCLD 
YFCNKQ 


3919 


A 


1 


204 


RVLTAINHTLKENLRKFYKGKKDKPLDLRPKKT 
RAMRRRLNMHEENLKTKKQHRKERLYPLRKYA 
AKA 


3920 


A 


1 


654 


RCCRSFVAPLQEKVVFGLFFLGAILCLSFSWLFHT 

VYCHSEGVSRLFSKLDYSGIALLIMGSFVPWLYY 

SFYCNPQPCFIYLiyiCVLGIAAIIVSQWDMFATPQ 

YRGVRAGVFLGLGLSGUPTLHYVISEGFLKAATI 

GQIGWLMLMASLYITGAALYAARJPERFFPGKCD 

IWFHSHQLFHIFWAGAFVHFHGVS>nLQEFRFMI 

GGGCSEEDAL 


3921 


A


1587


452


LERDjGCGGEEGGSyRSGAGPDSDPRGASSPPAG 
HRGTAASPRPVi^^ 

PAWRRVQWFSRVSGQVSTLMKATVLMRQPGRV 

QEIVGALRKGGGDRLQVISDFDMTLSRFAYNGK 

RCPSSYNBLDNSKUSEECRKELTALLHHYYPIEID 

PHRTVKEKLPHMVEWWTKAHNLLCQQKIQKFQI 

AQWRESNAMLREGYKTFFNTLYHNNIPLFIFSA 

GIGDILEEnRQMKXOOTNIHIVSNYMDFNEDGFL 

QGFKGQLIHTYNKNSSACENCGYFQQLEGKTNV 

ILLGDSIGDLTMADGVPGVQNILKIGFLNDKVBE 

RRERYMDSYDIVLEKDETLDWNGLLQHILCQG 

VQLEMQGP 


3922 


A


2 


164 


GKIYQRAFGGHSLKFGKGVQAHGCCCVADRTG 

HSILHTSYGRERPAPVHLRQDT 


3923 


A 


2 


3258 


EHATHAYAKLGTRRRHREVTVFVPTWQLKKNR 

RVRESHFLTKLHSLKMLSITPSQLENGKKITTYD 

YRFMVKLAEETOGIIVTNEQmiLMNSSKKLMVK 

DRLLPFTFAGNLFMVPDbPLGRDGPTLDEFLKKP 

NRLDTDIGNFLKVWKTLPPSSASVTELSDDADSG 

PLESLPNMEEVREEKEERQDEEQRQGQGTQKAA 

EEDDLDSSLASVFRVECPSLSEEELRCLSLHDPPD 

GALDIDLLPGAASPYLGIPWDGKAPCQQVLAHL 

AQLTIPSNFTALSFFMGFMDSHRDAIPDYEALVG 

PLHSLLKQKPDWQWDQEHEEAFLALKRALVSAL 

CLMAPNSQLPFRLEVTVSHVALTAILHQEHSGRK 

HPIAYTSKPLLPDEESQGPQSGGDSPYAVAWALK 

HFSRCIGDTPVVLDLSYASRTTADPEVREGRRVS 

KAWLIRWSLLVQDKGKRALELALLQGLLGENRL 

LTPAASMPRFFQVLPPFSDLSTFVCIHMSGYCFYR 




EDEWCAGFGLYVLSPTSPPVSLSFSCSPYTPTYA 

HLAAVACGLERFGQSPLPWFLTHCNWIFSLLWE 

LLPLWRARGFLSSDGAPLPHPSLLSYIISLTSGLSS 

LPFIYRTSYRGSLFAVTVDTLAKQGAQGGGQWW 

SLPKDVPAPTVSPHAMGKRPNLLALQLSDSTLAD 

HAilLQAGQKLSGSSPFSSAFNSLSLDKESGLLMF 

KGDKKPRVWVVPTQLRRDLIFSVHDIPLGAHQR 

PEETYKKLRLLGWWPGMQEHVKDYCRSCLFCIP 

RNLIGSELKVIESPWPLRSTAPWSNLQIEWGPVT 

ISEEGHKHVLIVADPNTRWVEAFPLKPYTHTAVA 

QVLLQHVFARWGVPVRLEAAQGPQFARHVLVS 

CGLALGAQVASLSRDLQFPCLTSSGAYWEFKKA 

LKEFIFLHGKKWAASLPLLHLAFRASSTDATPFK 

VLTGGESRLTEPLWWEMSSANIEGLKMDVFLLQ 

LVGELLELHWRVADKASEKAENRRFKRESQEKE 

WNVGDQVLLLSLPRNGSSAKWVGPFYIGDRLSL 

SLYRIWGFPTPEKLGCIYPSSLMKAFAKSGTPLSF 

KVLEQ 


3924 


A 


1 


1826 


MGSVTVRYFCYGCLFTSATWTVLLFVYFNFSEV . 

TQPLKNVPVKGSGPHGPSPKKFYPRFTRGPSRVL 

EPQFKANODDVIDSRVEDPEEGHLKFSSELGMIF 

NERDQELRDLGYQKHAFNMLISDRLGYHRDVPD 

TRNAACKEKFYPPDLPAASWICFYNEAFSALLR 

TVHSVIDRTPAHLLHEIILVDDDSDFDDLKGELDE 

YVQKYLPGKIKXORNTBCREGLIRGRMIGAAHATG 

EVLVFLDSHCEVNVMWLQPLLAAIREDRHTVGC 

PVIDIISADTLAYSSSPVVRGGFNWGLHFKWDLV 

PLSELGRAEGATAPIKSPTMAGGLFAMNRQYFH 

ELGQYDSGMDIWGGENLEISFRIWMCGGKLFIIP 

CSRVGHIFRKRRPYGSPEGQDmTHNSLRLAHV 

WLDEYKEQYFSLRPDLKTKSYGNISERVELRKKL 

GCKSFKWYLDNVYPEMQISGSHAKPQQPIFVNR 

GPKRPKVLQRGRLYHLQTNKCLVAQGRPSQKG 

GLWLKACDYSDPNQIWIYNEEHELVLNSLLCLD 

MSETRSSDPPRLMKCHGSGGSQQWTFGKN>IRLY 

QVSVGQCLRAVDPLGQKGSVAMAICDGSSSQQ 

WHLEG 


3925 


A 


5386 


2897 


VRWNSKTECYLSIQTQENFPANLNELVNCrVISSL 

VTTQRKLKAMSLLGSKNQLARAVLNPNPMDFCT 

KDLLTTTSERIIAYLRDFNEDQKKAIETAYAMVK 

HSPSVAKICLfflGPPGTGKSKTIVGLLYRLLTENQ 

RKGHSDENSNAKIKQNRVLVCAPSNAAVDELM 

KKIILEFKEKCKDKKNPLGNCGDINLVRLGPEKSI 

NSE\a.KFSLDSQVNHRMKKELPSHVQAMHKRK 

EFLDYQLDELSRQRALCRGGREIQRQELDENISK 

VSKERQELASKIKEVQGRPQKTQSIIILESHIICCT 

LSTSGGLLLESAFRGQGGVPFSCVIVDEAGQSCEI 

ETLTPLIHRCNKLILVGDPKQLPPTVISMKAQEYG 

YDQSMMARFCRLLEENVEHNMISRLPILQLTVQ 

YRMHPDICLFPSNYVYNRNLKTNRQTEAIRCSSD 

WPFQPYLVFDVGDGSERRDNDSYINVQEIKLVM 

EIIKLIKDKRKDVSFRNIGnTHYKAQKTMIQKDL 

DKEFDRKGPAEVDTVDAFQGRQKDCVIVTCVRA 

NSIQGSIGFLASLQRLNVTITRAKYSLFILGHLRTL 

MENQHWNQLIQDAQBCRGAIIKTCDKNYRHDAV 




KIIJKLKPVLQRSLTHPPTIAPEGSRPQGGLPSSKl. 

DSGFAKTSVAASLYHTPSDSKEITLTVTSKDPERP 

PVHDQLQDPRLLKRMGEEVKGGIFLWDPQPSSPQ 

HPGATPPTGEPGFPWHQDLSHVQQPAAWAAL 

SSHKPPVRGEPPAASPEASTCQSKCDDPEEELCH 

RREARAFSEGEQEKCGSETHHTRRNSRWDKRTL 

EQEDSSSKKRKLL 


3926 


A 


99 


284 


MPREDRATWKSNYFLKEQLLDDYPKRFIVGANN 
VGSKQMQQIRMSLRGKAVVLMGKNTMMR 


3927 


A 


542 


2 


AHLLMLNLAL\TDLL\YLTSLPFLIHYYASGENW1 
FGDFMCKFIRFSFHFNLYSSILFLTCFSIFRYCVnH 
PMSCFSIHKTRCAWACAVVWnSLVAVn>MTFLI 
TSTNRTNRSACLDLTSSDELNTIKWYNLILTA\LL 
CLPLVIVTLCYTTIIHTLTHGHANXDSCLKQKARR 
LTILLL 


3928 


A 


1 


1516


GEEAVGGGAEGGGFGVGAQGRAGGRGVEAGR

MRLSKTLVDMDMADYSAALDPAYTTLEFENVQ

VLTMGNDTSPSEGTNLNAPNSLGVSALCAICGDR

ATGKHYGASSCDGCKGFFRRSVRKNHMYSCRFS

RQCVVDKDKRNQCRYCRLKKCFRAGMKKEAV

QNERDRISTRRSSYEDSSLPSINALLQAEVLSRQIT

SPVSGINGDIRAKKIASIADVCESMKEQLLVLVE

WAKYIPGFCELPLDDQGALLRAHAGEHLLLGAT

KRSMVFKDVLLLGNDYIVPRHCPELAEMSRVSIR

ILDELVLPFQELQIDDNEYAYLKAUFFDPDAKGL

SDPGKIKRLRSQVQVSLEDYINDRQYDSRGRFGE

LLLLLPTLQSITWQMFFIQIQFIKLFGMAKIDNLLQ

BMLLGGSPSDAPHAHHPLHPHLMQEHMGTNVIV

AHTTMPTHLSNGQMCEWPRPRGQAATPETPQPSP

PGASGSEPYKLLPGAVATWKPLSAIPQPTITKQE

VI 


3929 


A 


1 


2782 


RVLSLESPLEBCDPRVLGAQSVPRGRALKGLSPLG

LDSAFRLFPDPRAGPWNTAVLSSGMEPETALWG

PDLQGPEQSPNDAHRGAESENEEESPRQESSGEEI

IMGDPAQSPESKDSTEMSLERSSQDPSVPQNPPTP

LGHSNPLDHQPLDPPAPEWPTPSDWTXACEAS

WQWGALTTWNSPPVVPANEPSLRELVQGRPAG

AEKPYICNECGKSFSQWSKLLRHQRHTGERPNT

CSECGKSFTQSSHLVQHQRTHTGEKPYKCPDCG

KCFSWSSNLVQHQRTHTGEKPYKCTECEKAFTQ

STNLIKHQRSHTGEKPYKCGECRRAFYRSSDLIQ

HQATHTGEKPYKCPECGKRFGQNHNLLKHQKIH

AGEKPYRCTECGKSFIQSSELTQHQRTHTGEKPY

ECLECGKSFGHSSTLIKHQRTHLREDPFXCPVCG

KTFTLSATLLRHQRTHTGERPYKCPECGKSFSVS

SL^INHQRIHRGERPYICADCGKSFIMSSTLIRHQ

RIHTGEKPYKCSDCGKSFIRSSHLIQHRRTHTGEK

PYKCPECGKSFSQSSNLMRV^THMDENLFVCSD

CGKAFLEAHELEQHRVIHERGKTPARRAQGDSL

LGLGDPSLLTPPPGAKPHKCLVCGKGFNDEGIFM

QHQRIHIGENPYKNADGLIAHAAPKPPQLRSPRL

PFRGNSYPGAAEGRAEAPGQPLKPPEGQEGFSQR

RGLLSSKTYICSHCGESFLDRSVLLQHQLTHGNE

KPFLFPDYRIGLGEGAGPSPFLSGKPFKCPECKQS

FGLSSELLLHQKVHAGGKSSHKSPELGKSSSVLL

SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible nucleotide insertion)










EHLRSPLGARPYRCSDCRASFLDRVALTRHQETH 
TOFlirPPNPFDPPPFAVTLSTTKDEGEGETPTPTESS 
SHGEGQNPKTLVEEKPYLCPECGAGFTEVAALLL 
HRSCHPGVSL 


3930 


A 




Mli 


TirTOFTHTYTSFHTFFPFLOGFGNLPICMAKl'DLSLS 
HQPDKKGVPSDFILPISDVRASIGAGFIYPLVGTG 
SRESPLWL 


3931 


A 


lO 




T^T^nnFT srWPAFTVLGEARGDOVDWSKLYRDT 
GLVmSRKPRASSPFSlWHPSTPKRRGRGKHPLI 
PGPEALSKFPRQPIREKGPVKEVPGTKGSP 


3932 


A 


16 


305 


KRRDFLSCWPAFTVLGEARGDQVDWSKLYRDT 
GLVKMSRKPRASSPFSNNHPSTPKRRGRGKIIPLI 
PGPEALSKFPRQPIREKGPVKEVPGTKGSP 


3933 


A 


1 


1546 


STHASEHWDSALQLAICHLAPDQIPFISKEYAIQLE 

FAGDYVNALAHYEKGITGDNKEHDEACLAGVA 

QMSIRMGDIRRGVNQALKHPSRVLKRDCGAILE 

NMKQFSEAAQLYEKGLYYDKAASVYIRSKNWA 

KVGDLLPHVSSPKIHLQYAKAKEADGRYKEAVV 

AYENAKQWQSVIRIYLDHLNNPEKAVNIVRETQ 

SLDGAKMVARFFLQLGDYGSAIQFLVMSKCNNE 

AFTLAQQHNKMBIYADnGSEDrmEDYQSIALY 

FEGEKRYLQAGKFFLLCGQYSRALKHFLKCPSSE 

D>rVAIEMAIETVGQAKDELLTNQLIDHLLGEND 

n\AO^^T\ tttjt WTAT VOVRFA AOTATTTAT^FF 

(jjv)r|si JAN. I JLl^K-U I lYl/VL»JVv^ I ivC'/vrVV^ i rU,ii/^J\x:«Xj 

QSAGNYRNAHDVLFSMYAELKSQKIKIPSEMAT 

NLMILHSYILVKIHVKNGDHMKGARMLIRVANN 

ISBCFPSfflVPILTSTVIECHRAGLKNSAFSFAAML 

MRPEYR^KIDAKYKKKffiGMVRkPDISEffiEATTP 

CPFCKFLLPESELL 


3934 


A 


334 


1268 


PTRRPILPLTSPKAISVPSPLQGKQHILVKSCLSVS 

GIGGFLVSLSSRMKLQTLAVSVTALKFWSAYVP 

CQTQDRDALRLTLEQIDLmRMCASYSELELVTS 

AKALNDTQKLACLIGVEGGHSLDNSLSILRtFYM 

T r'\^"DVT TT TTJTPTsJTPWAF<5^AK'GVWSFYNTnITSGL 

TDFGEKVVAEMNRLGMMVDLSHVSDAVARRAL 

EVSQAPVIFSHSAARGVCNSARNVPDDILQLLEE 

ERWAFVMVSLFHGELIQWQPIRPMCSTVADHFD 

HIKAV\IGSKFIGIGGDYDGAGKYRKKTTCKAPW 

RTSSRMSS 


3935 


A 


1 


883 


HETTPAWQSVLLERGWNKFDKQEQNAEDWNL 

YWRTSSFRMTEHNSVKPWQQLNHHPGTTKLTR 

KDCLAKHLKHMRRMYGTSLYQFIPLTFVMPNDY 

towafyfOFT^OMT rrTKHSYWICKPAELSRGRG 

ILIFSDFKDFIFDDMYIVQKYISNPLLIGRYKCDLR 

lYVCVTGFKPLTIYVYQEGLVRFATEBCFDLSNLQ 

NNYAHLTNSSINKSGASYEKIKEVIGHGCKWTLS 

RFFSYLRSWDVDDLLLWKKiroUVrVILTILAIAPS 

VPFAANCFELFGFDILIDDNEFHRTG 


3936 


A 


203 


441 


HLAHSLGPLPKHYQYCVRYLYYQVTKDVIKEFA 
DDGVKYLELRSTPRRENATGMTKKTYVESILEGI 
KQSKQENLDIDV 
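
The one-letter legend that heads the amino acid sequence table can be applied mechanically. The following is a minimal sketch, not part of the original disclosure: the function name `summarize` and the example string are hypothetical, and the only information taken from the table is the legend itself (residue letters plus the `*`, `/` and `\` markers). It may also help flag characters that fall outside the legend, such as digits left behind by text extraction.

```python
# One-letter amino acid codes exactly as listed in the table legend.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}

# Non-residue markers defined by the same legend.
MARKERS = {
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize(seq):
    """Count residues and legend markers in seq (hypothetical helper).

    Returns (residue_counts, marker_counts, unrecognized), where
    unrecognized lists every character the legend does not define.
    """
    residue_counts = {}
    marker_counts = {}
    unrecognized = []
    for ch in seq.upper():
        if ch.isspace():
            continue  # ignore line breaks and spaces inside a table cell
        if ch in LEGEND:
            residue_counts[ch] = residue_counts.get(ch, 0) + 1
        elif ch in MARKERS:
            name = MARKERS[ch]
            marker_counts[name] = marker_counts.get(name, 0) + 1
        else:
            unrecognized.append(ch)
    return residue_counts, marker_counts, unrecognized


# Made-up example string, not a sequence from the table.
residues, markers, unknown = summarize("MK*A/Z")
```

Here `Z` ends up in `unknown` because the legend does not define it, while `*` and `/` are counted as a stop codon and a possible nucleotide deletion.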

SEQ ID NO:


Position of end of
Signal in Amino Acid
Sequence


MaxS (Maximum
Score)


MeanS (Mean Score)


1 


19 


0.930 


0.680 


2 


24 


0.964 


0.863 


3 


21 


0.990 


0.901 


4 


19 


0.981 


0.942 


5 


22 


0.991 


0.928 


6 


21 


0.956 


0.843 


8 


22 


0.913 


0.718 


9 


17 


0.997 


0.969 


11 


19 


0.930 


0.680 


13 


36 


0.983 


0.863 


14 


28 


0.935 


0.839 


15 


21 


0.997 


0.955 


16 


16 


0.983 


0.944 


17 


18 


0.989 


0.884 


19 


49 


0.996 


0.719 


20 


28 


0.972 


0.920 


21 


23 


0.954 


0.905 


22 


46 


0.955 


0.568 


23 


26 


0.942 


0.654 


24 


19 


0.979 


0.941 


25 


34 


0.884 


0.565 


26 


33 


0.934 


0.584 


27 


17 


0.975 


0.914 


28 


18 


0.980 


0.934 


29 


23 


0.928


0.718


30 


26 


0.978 


0.885 


32 


20 


0.946 


0.719 


33 


29


0.933


0.671


35 


25 


0S96 • - 


0.920 


36 


26 


0.903 


0.579 


40 


19 


0.981 


0.942


47 


25 


0.971 


0.909 


53 


22 


0.991 


0.928 


55 


24 


0.960 


0.808 


60 


19 


0.986 


0.967 


78 


22 


0.913 


0.718 


86 


20 


0.883 


0.555 


87 


24 


0.982 


0.889 


88 


17 


0.997 


0.969 


115 


19 


0.930 


0.680 


134 


36 


0.983 


0.863 


136 


17 


0.913 


0.696 


137 


19 


0.958 


0.905 


140 


28 


0.935 


0.839 


143 


32 


0.914 


0.740 


153 


21 


0.997 


0.955 


154 


25 


0.913 


0.583 


155 


29 


0.972 


0.857 


169 


30 


0.977 


0.817 


170 


30 


0.977 


0.819 


171 


30 


0.977 


0.819 


175 


47 


0.926 


0.606 


176 


30 


0.968 


0.872 


177 


22 


0.957 


0.791 


192 


43 


0.930


0.678 

SEQ ID NO:


Position of end of
Signal in Amino Acid
Sequence


MaxS (Maximum
Score)


MeanS (Mean Score)






195 


ID 


JO 


0.860 


ZVZ 


2J 




0.871 


OA'S 


24 




0.870 


OAT 

20/ 


23 




0.905 


224 


HO 


n OSS 


0.568 




20 


0 047 


0.654 


ooo 

22 o 


45 


n 0^1 


0.839 


O^ 1 

231 


25 


0 004 


0.937 


0*20 

2j2 


2o 




0.896 




10 


\}.y iz? 


0.942 


2jj 


10 

ly 


0 07Q 


0.941 


o^c 
23 o 


2U 


n 0R7 


0.943 


O /I >l 

244 


23 


0 090 


0.683 


OCA 

250 


34 


n RR4 


0.565 


25o 


33 


n o'^4 


0.584 


258 


25 


n 0^4 


0.729 


O CA 

259 


22 


n 0^0 


0.871 


2o4 


1 Q 

ly 


n 0S9 


0.753 


2o5 


1*7 


f) 07S 


0.914 


266 


1 / 


A 07S 
U.7/ J 


0.914 


271 


OQ 

23 


n 074 


0.884 


274 


13 


n 071 


0.834 


OTC 

275 


1 Q 




0.934 


278 


1^ 

5Z. 


n os$i 
u.yjo 


0.668 


280 


o>i 
24 


n 0^^ 


0.881 


281 


o>i 
24 


A 0^^ 


0.881 


286 


oo 
23 


n 09R 


0.718 


291 


35 


n 001 


0.824 


293 


2 / 




0.806 


OA>l 

294 


23


A 0S91 . - ' 


0.827


301 




-A 078 ■ ' 
V.y/o 


0.885


316 


20 


A 04f^ 


0.719 


^OA 

320 


Oft 
2o 


A 07J? 


0.726 


00*7 

327 


29 




0.671 


331 


/I Q 

4o 


A OA*^ 


0.571 


AC 

345 


25 


A QQ^ 


0.920 


349 


2o 


A OA*^ 


0.579 


act 
351 


24 


A 0S1 


0.876 


352 


lo 


A Q44 


0.716 


353 


32 


A 009 


0.854


354 


OT 
2/ 


A 04S 


0.817 


355 


1 

10 


A 099 


0.716 


350 


13 


0 0S9 


0.818 


357 


23 


0.986


0.878 


35o 




0.904


0.671 


359 


10 


0.988


0.951 


300 


1 c 
15 


0.981


0.938 


301 




0.944


0.716 


302 


21 


0.984


0.869 


3o3 


A.(\ 
4U 


A 070 


0.813 


364 


1 Q 
lo 




0.693 


365 


22 


0.962 


0.908 


366 


22 


0.961 


0.827 


367 


44 


0.941 


0.624 


368 


20 


0.952 


0.791 


369 


22 


0.949 


0.840 


370 


28 


0.957 


0.682 

SEQ ID NO:


Position of end of
Signal in Amino Acid
Sequence


MaxS (Maximum
Score)


MeanS (Mean Score)






372 


28 


0.974 


0.894 


373 


19 


0.972 


0.947 


374 


29 


0.968 


0.785 


375 


19 


0.949 


0.897 


377 


23 


0.962 


0.910 


378 


31 


0.974 


0.895 


379 


26 


0.969 


0.939 


380 


27 


0.945 


0.817 


383 


27 


0.945 


0.817 


384 


25 


0.992 


0.877 


385 


32 


0.983 


0.825 


386 


44 


0.924 


0.564 


387


26 


0.971 


0.894 


388 


19 


0.989 


0.862 


389 


24 


0.990 


0.947 


390 


34 


0.942 


0.635 


391 


16 


0.922 


0.716 


394 


19 


0.987 


0.970 


398 


36 


0.992 


0.866 


404 


13 


0.959 


0.818 


417 


23 


0.986 


0.878 


421 


19 


0.904 


0.671 


425 


28 


0.971 


0.717 


431 


16 


0.988 


0.951 


452 


18 


0.944 


0.716 


459 


21 


0.991 


0.902 


468


21 


0.984 


0.869 


478 


40 


0.979 


0.813 


486 


18 


0.883 


0.693 




22


0.962


0.908


501


19


0.962 


0.877 


514 


44 


0.941 


0.624 


529 


20 


0.952 


0.791 


533 


39 


0.914 


0.719 


548 


28 


0.957 


0.682 


561 


28 


0.974 


0.894 


562 


28 


0.974 


0.893 


564 


18 


0.949 


0.806 


576 


19 


0.972 


0.947 


584 


29 


0.968 


0.785 


585 


28 


0.973 


0.810 


591 


19 


0.949 


0.897 


592 


24 


0.991 


0.954 


594 


20 


0.985 


0.959 


595 


20 


0.985 


0.959 


612 


23 


0.962 


0.910 


619 


31 


0.974 


0.895 


621 


15 


0.959 


0.795 


633 


26 


0.969 


0.939 


640 


20 


0.949 


0.842 


645 


25 


0.911 


0.759 




25 


0.992 


0.877 


691 


32 


0.983 


0.825 


698 


44 


0.924 


0.564 


700 


19 


0.982 


0.941 


710 


26 


0.971 


0.894 


714 


23 


0.965 


0.907 

SEQ ID NO:


Position of end of
Signal in Amino Acid
Sequence


MaxS (Maximum
Score)


MeanS (Mean Score)


718 


19 


0.989 




725 


21 


0.976 


U.oDJ. 


728 


33 


0.961 




734 


25 


0.963 


U.ODU 


741 


34 


0.942 


KJ.Ojj 


744 


19 


A ACA 

0.959 


f\ Q'JA 


747 


16 


0.922 


U, / 10 


756 


26 


A A'7^ 

0.973 


U.oO*f 


161 


22 


0.986 




768 


27 


A A1 ^ 

0.916 


V. / Jo 


769 


19 


0.987 




770 


22 


0.981 




771 


34 


0.993 


A CQ1 


773 


20 


0.968 




774 


21 


0.971 


A OA< 


778 


22 


0.986 


A QAI 


779 


32 


0.973 


A 


781 


23 


0.950 


A c<:7 
U.oD / 


785 


27 


0.916 


A T^Q 
U./DO • 


786 


27 


0.916 


A 


788 


22 


0.981 




793 


22 


0.986 


A QA^ 


794 


39 


0.892 




797 


27 


0.965 


A QAI 


810 


22 


0.981 


A Oil 


823 


34 


0.993 


A QQ1 


825 


17 


0.962 


U./ /o 


837 


20 


0.968 


A A 


844 


25 


0.984 


A O^l 


845


17 


0.919 


U. /UO . 


846 


21


0.971


u.y4j 


847 


21 


0.971 




890 


22 


0.986 




893 


24 


0.971 


A Q« 


894 


24 


0.971 




896 


32 


0.973 


U.o40 


899 


31 


0.982 


A Q 1 T 


922 


15 


0.882 


A TA/J 


924 


21 


0.975 


A QA C 


oo< 

yzj 


91 
^1 


0.927 


0.661 


933 


20 


0.967 


0.906 


960 


20 


0.967 


0.906 


967 


38 


0.970 


0.784 


968 


47 


0.970 


0.557 


972 


36 


0.945 


0.775 



TABLE 8



SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible nucleotide insertion)


3955 


A 


235 


1272 


GPREVLAASSLADGSEEQVMAVALVRERDLSFPG 
VGDAWNPTRWHLPAQPEMLYEGGEGRMETLK 

SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible nucleotide insertion)










DKTLQELEELQNDSEAIDQLALESPEVQDLQLERE 

MALATNRSLAERNLEFQGPLEISRSNLSDRYQELR 

KLVERCQEQKAKLEKFSSALQPGTLLDLLQVEGM 

KffiEESEAMAEKFLEGEWLETFLEOTSSMRMLSH 

LRRVRVEKLQEWRKPRASQELAGDAPPPRSPPP 

V/PPSPPGNTPCG*RAAAATISHASLPFALQPIPQPA 

CGPHCPWSPATGPFPSSVPALLLQRASGPHLPGSP 

AWTQGCCGLLLVPTEEHAAPPYGFPPPPGPAWPG 

Y 


3956 


A 


821 


385 


SICADRTERVGIFFYIPAGTTDEADVTHP*EGHSYL 

SNHAGIQRSSRP/SHYQGE/WHDNCFTADELQLLT 

YQLCHTYVRCTRSVSIPAPAYYAHLVAFRARYHL 

VDKEHDSAEGSHVSGQSNGRDPQALAKAVQfflQ 

DTLRTMYFA 


3957 


A 


4621 



240 


ELISTFKLLLEKKRSEVMKMKKRYEVGLEKLDSA 
SSQVATMQMELEALHPQLKVASKEVDEMMIMIE 
KESVEVAKTEKIVKADETIANEQAMASKAIKDEC 
DADLAGALPILESALAALDTLTAQDITWKSMKSP 
PAGVKLVMEAICILKGIKADKIPDPTGSGKKIEDF 
WGPAKRLLGDMRFLQSLHEYDKDNIPPAYMNIIR 
KNYIPNPDFWEKIRNASTAAEGLCKWVIAMDSY 
DKVAKIVAPKKIKLAAAEGELKIAMDGLRKKQA 
ALKEVQDKLARLQDTLELNKQKKADLENQVDLC 
SKKLERAEQLIGGLGGEKTRWSHTALELGQLYIN 
LTGDILISSGVVAYLGAFTSTYRQNQTKEWTTLCK 
: GRBIPCSDD.CSLMGTLGEAVTIRTWNIAGLPSDSF 

ii5J©NG^^^M^ 

NSLYVIKLSEPDYVRTLENCIQFGTPVLLENVGEE 

LDPILEPLLLKQTFKQGGSTCIRLGDSHEYAPDFR 

FYITTKLRNPHYLPETSVKVTLLNFMITPEGMQDQ 

LLGIVVAQERPDLEEEKQALILQGAENKRQLKEIE 

DKILEVLSSSEGNILEDETADOLSSSKALANEISQK 

QEVAEETEKKroTTRMGYRPIAIHSSILFFSLADLA 

NIEPMYQYSLTWFINLFDLSffiNSEKSEILAKRLQIL 

KDHFTYSLYVNVCRSLFEKDKLLFSFCLTINLLLH 

ERAINKAEWRFLLTGGIGLDNPYANPCTWLPQKS 

WDEICRLDDLPAFKTIRREFMRLKDGWKKVYDSL 

EPHHEVFPEEWEDKANEFQRMLIIRCLRPDKVIPM 

LQEFnNRLGRAFBEPPPFDLAKAFGDSNCCAPLIFV 

LSPGADPMAALLKFADDQGYGGSKLSSLSLGQGQ 

GPIAMKMLEKAVKJEGTWVLQNCHLATSWMPT 

LEKVCEELSPESTHPDFRMWLTSYPSPNFPVSVLQ 

NGVKMTNEAPKGLRANHRSYLMDPISDPEFFGSC 

KECPEEFKKLLYGLCFFHALVQERRKFGPLWWNIP 

YEFNETDLRISVQQLHMFLNQYEELPYEALRYMT 

GECNYGGRVTDDWDRRTLRSILNKJFFNPELVENS 

DYKFDSSGIYFVPPSGDHKSYIEYTKTLPLTPAPEI 

FGMNANADITKDQSETQLLFDNILLTQSRSAGAG 

AKSSDEVVNEVASDILGKLPNNFDIEAAMRRYPT 

TYTQSMNTVLVQEMGRFNKLLKTIRDSCVNIQKA 

IKGLAVMSTDLEEWSSILNVKIPEMWMGKSYPS 

LKPLGSYVNDFLARLKTLQQWYEVGPPPVFWLSG 

FFFTQAFLTGAQQNYARKYTIPIDLLGFDYEVMED 

KEYKHPPEDGVFIHGLFLDGASWNRKIKKLAESH 

PKILYDTVPVMWLKPCKRADIPKRPSYVAPLYKT 

SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible nucleotide insertion)










SERRGVLSTTGHSTNFVIA\MTLPSDQFKEHW1GR 
GVALLCQLNS 


3958 


A 


35 


529 


GADMAKSKI^TTHNQSRKWHRIWIKXPLSQRV^ 

SLKGVDPKFLGNMCFTKKHKKKGLKKMQADSA 

KAVSTCAKAIEALVKPKEVKPKIPKGVSCELN*LA 

YIAYPKFWTCACACIAKGLRLCQPKAKAQDQTK 

AQVQIKAQAAAPASVPTQAPKGAQAPTKASG 


3959 


A 


1883 


763 


LLVLLLRTNLLIASSTRISRATLTCSPPGIPVDPRVR 

PRVRSHLVMYLGITTGSLHKAVVSGDSSAHLVEEI 

QLFPDPEPVRNLQLAPTQGAVFVGFSGGVWRVPR 

ANCSVYESCVDCVLARDPHCAWDPESRTCCLLSA 

PNLNSWKQDMERGNPEWACASGPMSRSLRPQSR 

PQHKEVLAVPNSILELPCPHLSALASYYWSHGPAA 

VPEASSTVYNGSLLLIVQDGVGGLYQCWATENGF 

SYPVISYWVDSQDQTLALDPELAGIPREHVKVPLT 

RVSGGAALAAQQSYWPHFVTVTVLFALVLSGALI 

ILVASPLRALRARGKVQGCETLRPGEKAPLSREQH 

T O^PKECRTSASDVDADNNCLGTEVA 


3960 


A 


1 


481 


SYAAPSLFVKSLYWALAFMAVLLAVSGWIVVLA 
SRAGARCQQCPPGWVLSEEHCYYFSAEAQAWEA 
SQAFCSAYHATLPLLSHTQDFLGRYPVSRHSWVG 
AWRGPQGWHWIDEAPLPPQLLPEDGEDNLDINCG 
ALEEGTLVAANCSTPRPWVCAKGTQ 



TABLE 9

SEQ ID NO:  Accession Number  Species                   Description                                                Smith-Waterman Score  % Identity
3937        Y27700            Homo sapiens              Human secreted protein encoded by gene No. 12.             193                   25
3938        AF093097          Homo sapiens              putative RNA-binding protein Q99                           3881                  84
3939        AB012308          Anthocidaris crassispina  B2HC                                                       4169                  74
3940        U10248            Homo sapiens              ribosomal protein L29                                      787                   95
3941        Y99418            Homo sapiens              Human PRO1317 (UNQ783) amino acid sequence SEQ ID NO:277.  4031                  100
3942        AL023516          Gallus gallus             B locus C type Lectin                                      198                   35
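
Table 9 reports a Smith-Waterman score for each sequence against its nearest database match. The following is a minimal sketch of how such a local alignment score is computed, offered only as an illustration. The scoring values (match +2, mismatch -1, linear gap -1) and the function name are assumptions made here; the table does not state the substitution matrix, gap penalties or software actually used, so scores from this sketch are not comparable with the tabulated values.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b.

    Uses the Smith-Waterman recurrence with a linear gap penalty.
    The scoring parameters are illustrative assumptions, not the
    parameters behind the scores listed in the table.
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Made-up peptides: the shared core "ACDE" gives four matches.
score = smith_waterman_score("GGACDEGG", "TTACDETT")
```

Because the alignment is local, the unrelated flanking residues in the example do not reduce the score; only the best-scoring shared region counts.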



TABLE 10

SEQ ID NO:  Accession No.  Description                      Results*
3937        PR00049        WILM'S TUMOUR PROTEIN SIGNATURE  PR00049D 0.00 9.168e-11 209-224
3942        BL00615        C-type lectin domain proteins.   BL00615A 16.68 6.400e-11 37-55



sequence 

TABLE 11

SEQ ID NO:  PFAM Name       Description                    P-Value   PFAM Score
3938        Piwi            Piwi domain                    2.6e-150  512.7
3940        Ribosomal_L29e  Ribosomal L29e protein family  2.3e-19   77.8
3941        Sema            Sema domain                    4e-181    615.1
3942        lectin_c        Lectin C-type domain           0.086     -7.1



TABLE 12

SEQ ID NO:  Position of end of Signal in Amino Acid Sequence  MaxS (Maximum Score)  MeanS (Mean Score)
3941        31                                                0.985                 0.926
3942        21                                                0.974                 0.894
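
The "position of end of signal" column can be used to separate a predicted signal peptide from the remaining mature portion of a protein. The following is a minimal sketch under one stated assumption: the tabulated position is treated as the 1-based index of the last residue of the signal peptide. The function name and the example protein are hypothetical, and the sketch does not reproduce the prediction itself (the MaxS and MeanS scores).

```python
def split_signal_peptide(protein, signal_end):
    """Split protein into (signal_peptide, mature_portion).

    signal_end is assumed to be the 1-based position of the last
    residue of the predicted signal peptide.
    """
    if not 0 < signal_end < len(protein):
        raise ValueError("signal_end must fall inside the protein sequence")
    return protein[:signal_end], protein[signal_end:]


# Made-up protein with a 16-residue signal peptide; not a sequence
# from the sequence listing.
signal, mature = split_signal_peptide("MKLLVLAAFLCGSAQAEVQLT", 16)
```

In this example `signal` holds the first 16 residues and `mature` holds the remaining five.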


TABLE 13

SEQ ID NO: of full  SEQ ID NO: of full  SEQ ID NO: of      SEQ ID NO: of   Priority Docket number    SEQ ID NO: in
length nucleotide   length peptide      contig nucleotide  contig peptide  corresponding SEQ ID NO:  USSN 09/496,914
sequence            sequence            sequence           sequence        in priority application

3937                3943                3949               3955            787CIP2G 1                787 3587
3938                3944                3950               3956            787CIP2G 2                787 3813
3939                3945                3951               3957            787CIP2G 3                787 4462
3940                3946                3952               3958            787CIP2G 4                787 4887
3941                3947                3953               3959            787CIP2G 5                787 5794
3942                3948                3954               3960            787CIP2G 6                787 8743



TABLE 14

TISSUE ORIGIN                           LIBRARY/RNA SOURCE   HYSEQ LIBRARY NAME  SEQ ID NOS:
adult brain                             GIBCO                ABD003              3940
adult brain                             Clontech             ABR006              3940
adult brain                             Invitrogen           ABR014              3940
cultured preadipocytes                  Strategene           ADP001              3937
adult heart                             GIBCO                AHR001              3940
adult kidney                            GIBCO                AKD001              3940
adult lung                              GIBCO                ALG001              3940
young liver                             GIBCO                ALV001              3940
adult ovary                             Invitrogen           AOV001              3938, 3940-3941
adult spleen                            GIBCO                ASP001              3940-3941
testis                                  GIBCO                ATS001              3940
bone marrow                             Clontech             BMD001              3938, 3940
bone marrow                             Clontech             BMD004              3940
adult cervix                            BioChain             CVX001              3940
endothelial cells                       Strategene           EDT001              3940
fetal brain                             Clontech             FBR006              3940
fetal brain                             Invitrogen           FBT002              3940-3941
fetal heart                             Invitrogen           FHR001              3940
fetal kidney                            Clontech             FKD001              3940
fetal kidney                            Clontech             FKD002              3940
fetal liver-spleen                      Columbia University  FLS001              [illegible]
fetal liver-spleen                      Columbia University  FLS002              [illegible]
fetal liver-spleen                      Columbia University  FLS003              [illegible]
fetal liver                             Clontech             [illegible]         [illegible]
fetal skin                              Invitrogen           FSK001              [illegible]
fetal spleen                            BioChain             FSF001              [illegible]
fetal brain                             GIBCO                HFB001              3937, 3940-3941
infant brain                            Columbia University  IB2002              [illegible]
leukocyte                               GIBCO                LUC001              [illegible]
leukocyte                               Clontech             LUC003              [illegible]
melanoma from cell line ATCC #CRL 1424  Clontech             MEL004              3940
mammary gland                           Invitrogen           MMG001              [illegible]
neuronal cells                          Strategene           [illegible]         [illegible]
prostate                                Clontech             [illegible]         [illegible]
rectum                                  Invitrogen           [illegible]         [illegible]
salivary gland                          Clontech             [illegible]         [illegible]
small intestine                         Clontech             SIN001              3940
skeletal muscle                         Clontech             SKM001              3940
spinal cord                             Clontech             SPC001              3940
thymus                                  Clontech             THMc02              3938
thyroid gland                           Clontech             THR001              3942
uterus                                  Clontech             UTR001              3940
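
The SEQ ID NOS column of Table 14 mixes single identifiers with hyphenated ranges, for example "3938, 3940-3941". The following is a minimal sketch of how such an entry can be expanded into individual SEQ ID numbers. The function name is hypothetical, and the sketch assumes that entries are comma-separated and that a hyphen denotes an inclusive range.

```python
def parse_seq_ids(text):
    """Expand an entry such as '3938, 3940-3941' into a list of integers.

    Assumes comma-separated items, where 'a-b' means every SEQ ID NO
    from a to b inclusive.
    """
    ids = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas or empty cells
        if "-" in part:
            low, high = (int(x) for x in part.split("-", 1))
            ids.extend(range(low, high + 1))
        else:
            ids.append(int(part))
    return ids


# Entry taken from the adult ovary row of the table.
ovary_ids = parse_seq_ids("3938, 3940-3941")
```

The same helper could be applied to each row to list, for a given SEQ ID NO, the tissue libraries in which it was observed.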

WHAT IS CLAIMED IS:

1. An isolated polynucleotide comprising a nucleotide sequence selected from the group
consisting of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a full length protein
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a mature protein
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, an active domain
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, and complementary
sequences thereof.

2. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said
polynucleotide hybridizes to the polynucleotide of claim 1 under stringent hybridization
conditions.

3. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said 
polynucleotide has greater than about 90% sequence identity with the polynucleotide of claim 1. 

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the
complementary sequences.

6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1.

8. A host cell genetically engineered to comprise the polynucleotide of claim 1.

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 operatively
associated with a regulatory sequence that modulates expression of the polynucleotide in the host
cell.

10. An isolated polypeptide, wherein the polypeptide is selected from the group consisting of:

(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent conditions
with any one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954.

11. A composition comprising the polypeptide of claim 10 and a carrier.

12. An antibody directed against the polypeptide of claim 10.

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a complex
with the polynucleotide of claim 1 for a period sufficient to form the complex; and

b) detecting the complex, so that if a complex is detected, the polynucleotide
of claim 1 is detected.

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample under stringent hybridization conditions with
nucleic acid primers that anneal to the polynucleotide of claim 1 under such conditions;

b) amplifying a product comprising at least a portion of the polynucleotide of
claim 1; and

c) detecting said product and thereby the polynucleotide of claim 1 in the
sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the method 
further comprises reverse transcribing an annealed RNA molecule into a cDNA polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a complex
with the polypeptide under conditions and for a period sufficient to form the complex; and

b) detecting formation of the complex, so that if a complex formation is
detected, the polypeptide of claim 10 is detected.

17. A method for identifying a compound that binds to the polypeptide of claim 10,
comprising:

a) contacting the compound with the polypeptide of claim 10 under
conditions sufficient to form a polypeptide/compound complex; and

b) detecting the complex, so that if the polypeptide/compound complex is
detected, a compound that binds to the polypeptide of claim 10 is identified.

18. A method for identifying a compound that binds to the polypeptide of claim 10,
comprising:

a) contacting the compound with the polypeptide of claim 10, in a cell, under
conditions sufficient to form a polypeptide/compound complex, wherein the complex drives
expression of a reporter gene sequence in the cell; and

b) detecting the complex by detecting reporter gene sequence expression, so
that if the polypeptide/compound complex is detected, a compound that binds to the polypeptide
of claim 10 is identified.

19. A method of producing the polypeptide of claim 10, comprising,

a) culturing a host cell comprising a polynucleotide sequence selected from
the group consisting of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a mature
protein coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, an active
domain coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954,
complementary sequences thereof and a polynucleotide sequence hybridizing under stringent
conditions to SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, under conditions
sufficient to express the polypeptide in said cell; and

b) isolating the polypeptide from the cell culture or cells of step (a).

20. An isolated polypeptide comprising an amino acid sequence selected from the group
consisting of any one of the polypeptides SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or
3955-3960, the mature protein portion thereof, or the active domain thereof.

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide array.

22. A collection of polynucleotides, wherein the collection comprises the sequence
information of at least one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954.

23. The collection of claim 22, wherein the collection is provided on a nucleic acid array.

24. The collection of claim 23, wherein the array detects full-matches to any one of the
polynucleotides in the collection.

25. The collection of claim 23, wherein the array detects mismatches to any one of the 
polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a computer-readable
format.

27. A method of treatment comprising administering to a mammalian subject in need thereof
a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20 and a
pharmaceutically acceptable carrier.

28. A method of treatment comprising administering to a mammalian subject in need thereof
a therapeutic amount of a composition comprising an antibody that specifically binds to a
polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier.

Pages 485 to 6221 of this application contain amino acid sequence listings.
They can be obtained at the address given below. 




World Intellectual Property Organization 
34, chemin des Colombettes 
CH-1211 Geneve 20 